Posted by: Paul Hewitt | November 6, 2009

The Future of Prediction Markets – Part II

As a followup to my previous post, this one covers Public prediction markets.  Up front, I have to admit that my interest in public prediction markets is minimal, mainly because I see very little potential for these types of markets to improve decision-making (public or private).  If they are unable to do this, what good are they?  I started writing this post in May, just after I completed my post on the future of enterprise prediction markets.  Instead of completing this post, I published posts on noteworthy failures of public prediction markets and about market calibration.

Recently, Chris Masse, on his Midas Oracle site, documented the very public failures of prediction markets to forecast the IOC’s eventual decision to hold the 2016 Olympics in Rio and to forecast the winner of the Nobel prize in Economics (or any of the other Nobel prizes, for that matter).  I made several comments on Midas Oracle about these failed markets, and the process has renewed my interest (ever so slightly) in public prediction markets.  Here is the result.

Is there a future for Public Prediction Markets?

Bet on it.  In fact, you may have to.  Exchanges, such as Betfair and InTrade, may be the only sustainable, profitable applications of prediction markets that are available to the public.  Let’s face it, people love to bet on uncertain outcomes.  Even when the odds are against them, people will try to beat the house.  In casinos, the odds are always against the bettor, yet there is no shortage of gamblers and the casinos become glitzier each year.  It’s no mystery where the money is coming from.

Internet-based prediction markets offer the public the convenience of betting at home.  They have the potential to greatly expand the variety and types of things on which wagers may be placed, from political races to trivial events, such as who might win the latest “star search” or who is the “best dancer”.  By adding to the variety of betting options, they expand the potential market for bettors.

Take away the real money component, and these prediction markets become nothing more than trivial pursuits.  Hubdub is a good example of a play money marketplace.  While it appears to be well-run, its use for anything other than “entertainment” is questionable.  Eventually, public prediction markets like these will fade away as newer fads invade the consciousness of the play money, esteem-seeking, public bettors.

There is some potential for real (serious) money prediction markets that might provide investors with a hedging mechanism against future events for which there may not be any form of insurance.  For example, a company could hedge against the risk of a particular piece of legislation becoming law (and having adverse effects on the company).

While there is a glimmer of hope that the U.S. anti-gambling laws may be relaxed in the future to allow real money prediction markets, the amounts that may be wagered are likely to be too small to attract any investors who wish to hedge against an uncertain event.  The betting limits will, however, provide a sufficient opening to allow betting exchanges to reach a vast new market in predictions.

Is there any real value in Public Prediction Markets?

Since public prediction markets operate in the same manner as enterprise markets, we can learn more about how these markets work and what makes them work well, by analysing the much more prevalent public prediction markets.  We can learn which types of markets tend to work well and which do not.  This may be useful in identifying appropriate uses for Enterprise Prediction Markets.  We could test public prediction markets to determine their consistency (or lack thereof).  We could make incremental changes to the markets to assess the effects on accuracy, consistency and the potential length of forecasting ability.

We could learn much about the role of information completeness by monitoring the information sets of market participants and comparing markets with similar participants but having differing information sets.  This may lead to insights about using prediction markets to replace some of the costly components of enterprise forecasting processes.  For example, if a public prediction market is able to more accurately (and consistently) forecast key components of an enterprise’s annual budget than the internal corporate methods, it may be possible to improve the efficiency of the planning process.  There may be additional benefits from engaging the enterprise’s customer base in the decision-making process, too. 

Apart from the knowledge gained from operating public prediction markets, one is hard pressed to find any significant benefit of these markets.  Do they help allocate resources to their best uses?  This may be a possible benefit, if the results of certain prediction markets are used to help shape public policy.  But, prediction markets are unproven in their abilities to consistently and accurately forecast or predict future outcomes and events.  Until they overcome these substantial limitations, their use for anything other than trivial pursuits will be rare.

Posted by: Paul Hewitt | October 20, 2009

More Public Prediction Market Failures

Recently, there have been several very glaring public prediction market failures, including the IOC site selection and the Economics Nobel Prize markets.  Some followers of prediction markets are a bit shocked and concerned, but most, like Chris Masse (Midas Oracle), me, and others are not.  These particular types of prediction markets never had a chance to be accurate.  Had any of these markets actually managed to “pick” the right outcome, it would have been nothing more than a fluke.  Why we continue to waste our time on these types of markets, I’ll never understand.

Jed Christiansen (Mercury’s Blog) is an occasional commenter on Midas Oracle.  I may not always agree with him, but I respect his positions in a number of areas.  However, in response to these very public failures of prediction markets, Jed provided a number of factors that influence the accuracy of prediction markets.  It appears that his comments apply only to outcomes that are determined by a group.  Essentially, he means outcomes that are determined through some form of voting or polling, including elections, IOC site selection, Academy Awards, Nobel Prizes, etc…  While I applaud his efforts to identify the factors affecting prediction market accuracy, I find some of his comments confusing.

For example, Jed mentions that “more members/voters will be better than fewer” (in terms of improving the accuracy of prediction markets).  In these types of markets, the members/voters are determining the actual outcome.  This is entirely independent of a prediction market attempting to predict that same outcome.  Consequently, having more members involved in determining the actual outcome will have no effect, whatsoever, on the accuracy of any related prediction market.  Jed’s comment makes no sense.

Jed is absolutely correct to say that “more objective criteria will be better than less.”  However, all this means is that the more objective the determinants of the outcome, the more likely the market participants will be able to figure them out and predict the outcome.  The fewer the factors and the less uncertainty surrounding their roles in determining the outcome, the easier it will be to predict the actual outcome.  In the extreme, if a condition arises that determines (or causes) the future outcome with a high degree of certainty, the market will be able to predict with uncanny precision.  However,  if the outcome is this easily predicted, perhaps a simple decision model (If… Then…) would have provided the same “prediction”, without the bother of setting up a prediction market.

Generally, I would agree with Jed that “constrained choices will be better than unconstrained choices.”  In keeping with this statement, the fewer the choices, the more likely it is that the outcome will be predictable (only because there are fewer incorrect options)!  However, the IOC markets showed that, even with only four choices, the markets failed.  The real problem is that these markets did not have the necessary information to choose among even a very small number of alternatives.

Again, I agree with Jed that “voters signalling choices before a vote is better than if they don’t.”  Where the outcome is determined by a vote, any prior information about how some or all of the group intends to vote will be important information to be assessed by the market participants.  This merely supports the information completeness principle.  We see many examples of this type of information being accessed by participants (in the Iowa political prediction markets), where political polling influences the market prices.

Finally, Jed made a curious statement about “secretive and less secretive” committees that make decisions and that “neither will likely be as accurate as traditional open prediction markets.”  I have no idea what he means, here!  The committees (secretive or not) are the ones determining (creating) the actual outcome.  The committee has nothing to do with being “accurate” or predicting the outcome.  Traditional markets are expected to predict actual outcomes.  Jed is simply wrong to try and compare these two concepts!

Panos Ipeirotis asked if there is a more principled method of capturing the determinants of prediction market accuracy.  In response, I would suggest that we look to the first principles of prediction markets.  Perhaps the most important of these is that the market must possess a sufficient degree of information completeness.  In the examples noted, the prediction market participants did not have an adequate level of information completeness to be able to arrive at accurate predictions, because the method of determining the outcome was far too complex and subjective, even when the choices were limited to four.

The only way to provide the necessary information to the prediction market, so that it could accurately determine the outcome, would have been to make all (or many) of the outcome voters (committee members) participants in the prediction market.  Of course, this would be a needless redundancy.  Note that in most of the enterprise prediction markets, many of the participants also take part in the internal forecasting process, effectively including the body of corporate information in the prediction markets.  If internal forecasting processes were to be replaced by prediction markets, it is highly doubtful that the markets would be able to provide accurate predictions.  The required information to make those accurate predictions would be missing.

These types of markets suffer from a fatal flaw, as well.  They are trying to predict a discrete (non-continuous variable) outcome.  “Coming close” means being completely wrong.  These types of markets are only suitable for betting purposes, and even then, only if they are proven to be “well-calibrated”.  It is questionable whether these particular markets were well-calibrated.

I have written fairly extensively on the determinants of prediction market usefulness.  I am especially concerned with their accuracy and consistency, for without these, their use in decision-making is not warranted.  I draw your attention to the following posts:

The Forgotten Principle Behind Prediction Markets

Calibration = Prediction Market Accuracy?

To answer Panos, we do have a general, principled model for assessing prediction market accuracy.  Now, we need to fill in the details.

Posted by: Paul Hewitt | September 30, 2009

Corporate Prediction Market Success is Elusive

A new study of prediction markets in the corporate world was released recently.  It’s called Forecasting Consumer Products Using Prediction Markets, by Kai Trepte and Rajaram Narayanaswamy.  Lo and behold, the prediction markets failed to provide any significant improvement in accuracy over that of the traditional corporate forecasting process.  The authors submitted their paper as part of their master’s program requirements.  They don’t appear to have been beholden to any software vendor, though they did use the services of Consensus Point.  Today’s entry will focus on the accuracy and usefulness of the prediction markets that were part of the study.  A subsequent entry will cover other aspects of prediction markets that were discussed by the authors.

The good news is that the authors planned the operation of the markets well, and they used more participants than most studies we have seen.  There appears to have been a conscious effort to maximize the diversity of the participants, but, like most of these studies, many of the prediction market participants also had involvement in the corporate forecasting process.  Consequently, we could pretty much expect that the predictions would be fairly well correlated with the corporate forecasts, and they were.  So, how did they compare? 

The prediction markets weren’t failures, but they weren’t able to do any better than the established corporate forecasting process at General Mills, where 20 prediction markets were put in play.  Despite the efforts of many academics, researchers, vendors and corporations,  the breakthrough success story about enterprise prediction markets remains as elusive as ever. 

FINDINGS & COMMENTARY

Correlation of Predictions and Forecasts

The Mean Absolute Percentage Errors (MAPE) of the prediction market and the operations forecast (internal process) were highly correlated.  As mentioned above, this is not surprising, given that many of those involved with the internal forecasting process were also involved with the prediction markets.  Furthermore, the initial probability distributions for the potential outcomes were based on normal distributions around the internally forecasted mean.  That is, the starting point for the prediction market was the corporate forecast.  There were good reasons for doing this, but still, it may have introduced some bias toward the internal forecast.
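To make the seeding step concrete, here is a minimal sketch of how starting prices for a set of outcome ranges could be derived from a normal distribution centred on an internal forecast.  The function name, the forecast of 1,000 units, the standard deviation of 100 and the range boundaries are all hypothetical; the study's actual parameters are not given here.

```python
from statistics import NormalDist


def seed_range_prices(forecast_mean, forecast_sd, bucket_edges):
    """Return a starting price (probability) for each outcome range.

    bucket_edges are the interior boundaries between ranges; the first
    and last ranges are treated as open-ended.
    """
    dist = NormalDist(mu=forecast_mean, sigma=forecast_sd)
    # Cumulative probability at each boundary, padded with 0 and 1
    cdf_at_edges = [0.0] + [dist.cdf(edge) for edge in bucket_edges] + [1.0]
    # Probability mass that falls inside each range
    return [hi - lo for lo, hi in zip(cdf_at_edges, cdf_at_edges[1:])]


# Hypothetical internal forecast: 1,000 units with a standard deviation of 100
prices = seed_range_prices(1000, 100, [800, 900, 1000, 1100, 1200])
for price in prices:
    print(f"{price:.3f}")
```

Because the starting prices are centred on the internal forecast, traders must move them away from that anchor for the market to disagree with it, which is one way the bias described above could creep in.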

The authors of the study found that the prediction market forecasts were virtually identical to those of the internal operations forecasting process, as evidenced by their means falling within one standard deviation of each other.  Consequently, we could say that both processes/methods were good aggregators of available information, and any information that was generated internally was also available to the market participants. 

Some Predictions are Better Than Others

The authors included three types of markets:  Volume, Product Category and Promotional markets.  The Volume markets were characterized by products that might be considered staples, with fairly stable consumption patterns.  Internal forecasts and market predictions were both able to accurately gauge the future outcome.  Product Category markets were a bit more difficult to predict or forecast, due to the nature of the products and strategies used.  Finally, the Promotional markets, which were characterized by products that had very significant promotions planned, were the most difficult to forecast.  Not even the corporate marketing people were very good at forecasting the effectiveness of the promotional activities.  Again, both the internal forecasts and the market predictions were even less accurate, but still they were basically the same.

It appears that, if it is difficult to analyse data to come up with an accurate forecast, as was the case with the promotional markets, the use of a prediction market will not magically generate the information necessary to make a better prediction.  We have seen this in other studies and examples, where there is a significant amount of uncertainty about the outcome.  This is the information completeness principle that I’ve discussed previously.

Very Short Term Markets

I should note that the prediction markets were in operation for no longer than 10 weeks.  The authors described some of their prediction markets as being “long term”, but in reality, they were anything but.  In our quest for a useful enterprise prediction market, it must be able to generate consistently accurate predictions, sufficiently in advance, so that decision-makers are able to change their tactics, based on the predictions.  In the study’s “longer term” markets, none were able to generate accurate predictions until very near the time when the actual outcome would have been set.  In these cases, management would not have had time to change their tactics or decisions, once the market prediction had become known.  Therefore, even if the prediction had been perfectly accurate, it would have been completely useless for any decision-making purposes.

Costs vs. Benefits

The authors did not discuss the issue of costs and benefits of prediction markets, but perhaps we should.  Given that both the traditional forecasting process and the prediction markets provided equivalent forecasts, should General Mills’ management scrap their costly forecasting process and adopt these neat new tools?  We can’t know for sure, right now, but if they were to discontinue the internal forecasting process, most of the useful information that needs to be aggregated in the prediction markets would not have been available to the participants.  Accordingly, we would expect the predictions to become very inaccurate. 

It would appear that the accuracy of the prediction markets depends upon the information created by the forecasting process.  If you can’t have prediction markets without the internal forecasting, why would General Mills add prediction markets to the process?  One reason might be to verify the accuracy of the internal forecast, but I’ll bet they already know that, historically, their forecasts are reasonably accurate for their decision-making purposes.  They might consider eliminating the internal aggregation function, while continuing to generate forecasting information.  Prediction markets would be relied upon to perform the aggregation of the information more efficiently.  Finally, prediction markets generate distributions of possible outcomes along with the mean prediction or forecast.  This information can be used to assess the risk and uncertainty surrounding the forecast, enabling management to make better contingency plans.

Filtering Bias

One of the benefits of prediction markets is their ability to filter out bias during the aggregation process.  Consequently, I (and the authors) expected the prediction markets to provide significantly more accurate forecasts than those generated from the internal forecasting process.  The fact that they were not more accurate means, to me, that General Mills’ internal forecasting process performs its function in a reasonably unbiased fashion.  We should be studying why they have been able to minimize bias in their planning!   Another possibility, which I find too scary to contemplate, is that prediction markets aren’t as good at filtering out the bias as we have been led to believe!

Calculating Accuracy

The authors don’t discuss the method of calculating the forecast or prediction error, other than to note that General Mills uses the MAPE (see above) to calculate their own internal forecast errors.  I have a couple of issues with this approach (which was also used in the HP study).  Using the absolute value of the error provides only the magnitude and no information about whether the prediction was an over- or under-estimation.  Accordingly, the actual error could be as much as twice the amount of the absolute error quoted.  Also, the authors (and others) use the actual outcome as the denominator in the calculation of the average.  This is incorrect, because it is the forecast (or prediction) value that is being evaluated, rather than the actual outcome.  Management relies upon the prediction in order to make decisions.  They don’t rely on the actual outcome (which isn’t known), when they are making decisions.  Accordingly, the prediction value should be used in the denominator and not the actual outcome.
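The difference between the two denominators, and what the absolute value hides, can be seen in a small sketch.  The function names and the figures are hypothetical and are not taken from the study; the signed measure is an illustrative addition, not something the authors or General Mills are known to use.

```python
def mape(actuals, predictions, denominator="actual"):
    """Mean absolute percentage error, in percent.

    denominator="actual" divides each error by the actual outcome (the
    conventional form); denominator="prediction" divides by the predicted
    value instead.
    """
    total = 0.0
    for actual, predicted in zip(actuals, predictions):
        base = actual if denominator == "actual" else predicted
        total += abs(actual - predicted) / abs(base)
    return 100.0 * total / len(actuals)


def mean_signed_pct_error(actuals, predictions):
    """Average signed error relative to the prediction, in percent.

    Positive values mean over-prediction on average, negative values
    mean under-prediction.
    """
    total = sum((p - a) / p for a, p in zip(actuals, predictions))
    return 100.0 * total / len(actuals)


# Hypothetical outcomes and predictions: one over-estimate, one under-estimate
actuals = [100.0, 200.0]
predictions = [125.0, 150.0]

print(mape(actuals, predictions, denominator="actual"))      # about 25.0
print(mape(actuals, predictions, denominator="prediction"))  # about 26.67
print(mean_signed_pct_error(actuals, predictions))           # about -6.67
```

With these made-up figures the two denominators give slightly different error rates, and the signed measure shows a net under-prediction that neither MAPE figure reveals.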

My next blog entry will cover the authors’ comments about the operation of these prediction markets and how well they appear to aggregate available information.

Adam Smith (1776) detected the “invisible hand” that seemed to be able to efficiently allocate resources in free markets, without the intervention of a central “planner”.  Against the backdrop of the USSR and its planned economy, the neoclassical model provided a substantial improvement in explaining the allocation of scarce resources among competing uses.  And so until very recently, most developed-country governments (and their constituents) embraced this model of economic theory as if it was, in fact, the way economies worked.  Many policies were designed to “free up” markets by eliminating constraints to economic activity.  The objective was always to allow Adam Smith’s “invisible hand” to do its magic for the benefit of all.

Most people who try to understand “economics” do so within the traditional, neoclassical framework that has dominated economic thinking for most of the last century or so.  This is the theory that is taught in introductory and intermediate-level economics courses.  Even those who have yet to study economics have come to embrace this model of economic thought.  They have been swayed by political and cultural institutions, media and the like to believe that the “machine” just needs to be oiled, gassed up, and the speed limits removed for economic prosperity to come to all.

Until very recently, if you needed any proof of this, all you had to do was review media reports and political commentary.  You would get the impression that Adam Smith was the greatest, smartest economist of all time, and that all we needed to do was follow his principles more closely and the economy would flourish.  They were advocating an almost religious belief in the “invisible hand”.  Scary.  Now, with all of the economic problems that have come to light, we are beginning to see a change.  More people are coming to see that the “economy” doesn’t work like the model, and no amount of tinkering will bring back the ideal model world.  Why?

Essentially, the neoclassical framework is a model of the economy that is simple enough to be understood, yet sufficiently robust to be able to describe and explain basic economic phenomena.  The real world is highly complex and likely impossible to model, without making generalizing assumptions about markets and their participants.  Among others, the neoclassical model assumes perfectly competitive markets, firms that always attempt to maximize profits, homogeneous households that always attempt to maximize their utility, and perfect information.  In addition, there are no externalities and all markets always clear.  The introduction of a “shock” to a market results in an immediate jump to a new equilibrium.

Introducing Information Economics

Born out of a disillusionment with the ability of neoclassical models to explain real world economic phenomena, a new paradigm emerged, the role of information imperfections in understanding economic conditions.  Joseph E. Stiglitz, Nobel laureate in 2001, identified information imperfections as one of the main reasons why the neoclassical model failed to explain many real economic conditions.  Where neoclassical models were characterized by a single market clearing price at equilibrium (quantity supplied equals quantity demanded), Stiglitz proved that with imperfect information, not only would markets exhibit a distribution of prices, but an equilibrium may not even exist and markets may not clear.  In extreme cases, caused by information problems, markets may be thin or fail to function at all.  I could go on about the achievements in information economics, but for now, these findings have particular relevance to our study of prediction markets.

Implications for Prediction Markets

Much of the theory behind prediction markets rests on the standard neoclassical model of markets.  Buyers and sellers interact, resulting in a market clearing price, which incorporates all of the available information about the market.  But, from information economics, we find that even small information imperfections can have profound effects on market functioning.  If the neoclassical model often fails to explain real world economic conditions, how can we expect prediction markets, based on the same theory, to explain, or describe, the underlying reality of their markets?

In the real world, information imperfections lead to market price distributions, rather than a single market clearing price.  Similarly, if prediction markets function like other product or asset markets, inevitably, the result will be a distribution of prices.

If prediction markets do not reach an equilibrium, and the market is characterized by a distribution of prices, can we still rely on the market “price” to convey all of the information available to the market?  If so, which price should we use?  How will we know when the market has incorporated enough information for the current price (or distribution) to be “accurate”?

I’m afraid I don’t have the answers to these questions.  Perhaps a group of information economists will join the discussion.

Posted by: Paul Hewitt | June 11, 2009

Why Public Prediction Markets Fail

Public prediction markets have the potential to be more accurate than enterprise markets, because they may be able to attract a larger “crowd” of traders, who may be able to aggregate a more complete information set, and the markets may be more efficient.  However, they often fail to predict actual outcomes, and their use in decision-making is dubious, at best.  They have some value as betting venues, if well-calibrated, and some nominal entertainment value (if you enjoy such trivial pursuits).  In contrast, the focus of enterprise prediction markets is on their value for decision-making purposes.

While my focus is on enterprise prediction markets, when public prediction markets fail to work properly, we need to understand why.  My attention has been drawn to a couple of recent cases (among the many) where public prediction markets failed (miserably) to predict the future outcomes.  They concerned the outcomes of American Idol (Betfair) and Britain’s Got Talent (Hubdub, Intrade).  Admittedly, these are very frivolous markets, but if prediction markets do work, shouldn’t they have been better at predicting the outcomes in these cases?  If public markets fail, why would we expect enterprise ones to work?  Both are based on the same theory.

Background – The Failed Markets

Betfair predicted Adam Lambert as the winner of season 8 on the American Idol show.  On May 19, he garnered 76% of the bettors’ money.  Kris Allen, the eventual winner, had 24%.  A few days later, Chris F. Masse blogged about the failure of prediction markets to select the eventual winner of Britain’s Got Talent, where the overwhelming favourite, Susan Boyle, failed to win.  On Hubdub, she closed with 78% of the trade money (none of the other nine competitors had more than 6%), while Intrade sent her off at about 49%.  Both of these prediction markets failed to accurately select the correct outcome.

Prediction Market Theory

Before trying to answer the question as to why these markets failed, we need to review the theory that supports markets having the ability to predict outcomes.  I’ll make this very brief, as I have covered this in my other posts.  I have also put together a companion post “A Lesson in Prediction Markets from the Game of Craps”.

The Efficient Markets Hypothesis holds that market prices accurately reflect all available information.  Since the prediction market shares (or “states”) have binary payoffs ($1 if right, $0 if wrong), the market price should represent the likelihood of that state coming true when the outcome is revealed.  If the market is not efficient, the market prices will not represent an accurate reflection of the information available to the market.  Therefore, market efficiency is an essential condition for prediction markets to “do their thing.”

Let’s proceed under the assumption that prediction markets are efficient.  In a typical winner-take-all market, there are several shares (states) that may be traded.  Each share has a binary payoff.  Therefore, each share price represents the likelihood of that state capturing the true outcome.  Putting all of the states together provides the entire probability distribution of the market predictions, and with perfect, accurate, complete information, this distribution would be an exact match with that of the actual outcomes.  That is, it would be perfectly calibrated.  The dispersion of the predictions reflects the underlying uncertainty of the outcome.  This uncertainty is caused by future random events that might affect the outcome.

What if the information available to the market participants is incomplete?  By definition, there will be some piece of information that the market participants are unable to consider in making their investment/betting decisions.  Since the market prices are only able to incorporate known information, the market prices will be inaccurate.  Consequently, the market distribution will not match that of the actual outcomes.  It follows that, to be accurate, the market must have sufficient information completeness.

If the essential conditions are satisfied, the market distribution will be well-calibrated.  This is the best case scenario for any prediction market.  But can it predict?
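One rough way to test whether a set of markets is well-calibrated is to group share prices into bands and compare the average price in each band with how often those shares actually paid out.  The sketch below assumes closing prices between 0 and 1 and a record of which shares paid out; the function name and the data are hypothetical.

```python
from collections import defaultdict


def calibration_table(prices, outcomes, n_bins=10):
    """Compare mean share price with observed payout frequency per price band.

    Returns a list of (mean_price, payout_rate, count) tuples, ordered
    from the lowest price band to the highest.
    """
    bins = defaultdict(list)
    for price, paid_out in zip(prices, outcomes):
        # Clamp so that a price of exactly 1.0 lands in the top band
        index = min(int(price * n_bins), n_bins - 1)
        bins[index].append((price, paid_out))

    table = []
    for index in sorted(bins):
        rows = bins[index]
        mean_price = sum(price for price, _ in rows) / len(rows)
        payout_rate = sum(paid for _, paid in rows) / len(rows)
        table.append((mean_price, payout_rate, len(rows)))
    return table


# Hypothetical closing prices and results (1 = share paid out, 0 = it did not)
prices = [0.72, 0.28, 0.78, 0.22, 0.75, 0.25, 0.91, 0.09]
outcomes = [1, 0, 1, 0, 0, 1, 1, 0]

for mean_price, payout_rate, count in calibration_table(prices, outcomes):
    print(f"mean price {mean_price:.2f}  paid out {payout_rate:.2f}  n={count}")
```

In a well-calibrated set of markets the two columns would roughly agree in every band.  With only a handful of observations, as in this made-up data, the comparison says very little, so a meaningful check needs many resolved markets.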

Professor Panos Ipeirotis provided an excellent explanation as to why prediction markets must fail to predict actual outcomes some of the time.  He points out that “such failed predictions are absolutely necessary if we want to take the concept of prediction markets seriously. If the frontrunner in a prediction market was always the winner, then the markets would have been a seriously flawed mechanism.” He is entirely correct. A prediction market provides a distribution of predictions that is a proxy for the distribution of actual outcomes.

What he means is that, if the frontrunner in a prediction market always wins, rational traders would always buy the frontrunner shares prior to the market closing, bidding up the price to $1, or just below that.  Any price significantly below $1 would indicate an inefficient market.  Let’s look at the first failed prediction.

American Idol (Betfair)

If Betfair’s market was efficient, Adam Lambert’s winning could not have been a sure thing.  There must have been uncertainty, and so, it is more than reasonable to say there was a 24% chance that he would lose.  If Betfair’s prediction market participants held accurate, complete information (collectively) and the market was efficient, we could say there was an unknowable uncertainty that prevented the market from pushing the Lambert price to $1.

From a decision-making viewpoint, we would be compelled to predict Adam Lambert as the winner (76% likely).  When we rely on a prediction of a discrete outcome, we need it to be correct almost all of the time, which means that the probability of the prediction must approach 100%.  By selecting Lambert, we will be either 100% right or 100% wrong when the contest is over.  There is no “almost right” with discrete outcomes.

We’re presented with a problem.  In order for prediction markets to generate accurate predictions, they must be efficient.  Such markets provide a market price that represents the probability of winning (in this case 76%).  If we needed to make a decision based on the outcome (yeah, right), we would like that probability to be closer to 100%.  However, if that were to occur, we’d hardly need a prediction market to point out the “sure bet”.  We could still make the decision, knowing that about one in every four seasons the “Adam Lambert” prediction will fail to win.  The problem is, we don’t know which season (or trial) this will happen.  This is the problem with discrete outcomes.
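The arithmetic behind the “one in every four” remark can be sketched as follows.  It assumes that the market price equals the true probability and that contests are independent of one another; both are simplifying assumptions, and the function names are hypothetical.

```python
def chance_frontrunner_never_fails(price, n_contests):
    """Probability that a frontrunner priced at `price` wins every contest."""
    return price ** n_contests


def expected_failures(price, n_contests):
    """Expected number of contests in which the frontrunner loses."""
    return (1 - price) * n_contests


# A frontrunner priced at 76%, over four independent contests
print(chance_frontrunner_never_fails(0.76, 4))  # roughly 0.33
print(expected_failures(0.76, 4))               # roughly 0.96
```

Under these assumptions, a decision-maker who backs the 76% favourite every time should expect about one miss in four contests, and has only about a one-in-three chance of getting through four contests without a miss.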

In some markets, future random events would introduce uncertainty as to the outcome.  As we all know, random events are unpredictable.  However, in this particular case, the prediction market closed shortly before the outcome was revealed.  The potential for random events to significantly affect the outcome would have been minimized.  Accordingly, I am left to conclude that the market participants did not possess sufficiently complete information about the outcome to make an accurate prediction or the market was not efficient.  In the latter case, the market would not even have been a good betting market, let alone a good prediction market.

Public vs. Enterprise Prediction Market Design

Public prediction markets are often designed as winner-take-all markets, with the shares (or states) corresponding to discrete outcomes.  Think of horses in a race or contestants on Britain’s Got Talent.  Many enterprise prediction markets also utilize the winner-take-all format, but with the shares corresponding to ranges of a continuous variable (outcome), such as quarterly sales.  This difference in market design is one of the main reasons why public prediction markets fail to predict outcomes accurately.

Prediction markets may provide accurate distributions of possible outcomes.  Even so, the most likely prediction may not be useful for predicting the next actual outcome.  Where the outcome being predicted is a continuous variable (e.g. quarterly sales), a market that fails, but comes close, may still be useful, whereas a market of discrete outcomes will only be useful if it is virtually 100% accurate.

Let’s turn to the public markets for Britain’s Got Talent…

Contrast of Public vs. Enterprise Prediction Market

The prediction market for Britain’s Got Talent had 10 “horses”, the frontrunner being Susan Boyle, with 78% of the trades on Hubdub (49% on Intrade).  On Hubdub, none of the other contestants had more than 6% of the trades and two had 0%.  It was a reasonably tight distribution, yet again, the market failed to predict the actual winner.  Here’s a graph of the distribution of trades on Hubdub.

Hubdub Prediction Market


Chris F. Masse (Midas Oracle) gleefully points out these and similar prediction market failures, as he questions the accuracy and usefulness of prediction markets. He is correct in his reasoning that, if someone wants to rely on a prediction market to forecast an outcome, he needs to have a high level of confidence that the prediction will come true.  With discrete outcomes, even the slightest miss is 100% wrong.  Even when the decision-maker selects the most likely option and it fails to be accurate, it is of little consolation to tell him that the distribution of predictions is accurate.

Contrast this with a similar, winner-take-all market to forecast quarterly sales.  Let’s take the identical distribution of trades from Britain’s Got Talent and match them to a series of states corresponding to quarterly sales ranges.  Then, sort them to create a somewhat normal distribution, as shown.  In this example, quarterly sales is a continuous variable.  Accordingly, the prediction market provides us with a best forecast of $15.220M.  Without having to do the calculations, we can see that 90% of the time, actual quarterly sales will fall between $14.5M and $15.999M.  Given the mean prediction of $15.220M, the maximum error (90% of the time) will be 5.1% or $0.779M.  This is the kind of prediction market accuracy that “can be taken to the bank.” Two markets with identical distributions: One predicts very accurately (continuous), the other is a bust (discrete).
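For anyone who does want to do the calculations, here is a short sketch of the error arithmetic. It uses only the three figures quoted above; the variable names are mine.

```python
mean_forecast = 15.220  # $M, the market's best forecast
lower_bound = 14.5      # $M, bottom of the 90% range
upper_bound = 15.999    # $M, top of the 90% range

# The worst miss inside the 90% range is the distance to the farther bound.
max_error = max(mean_forecast - lower_bound, upper_bound - mean_forecast)
max_error_pct = 100 * max_error / mean_forecast

print(f"Maximum error: ${max_error:.3f}M ({max_error_pct:.1f}% of the forecast)")
```

The upper bound is the farther of the two, which is where the $0.779M figure comes from.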

Quarterly Sales Prediction Market


Based on this, we can conclude that for most prediction markets involving discrete outcomes, the predictions will be questionable.

It appears that the uncertainty of the outcome depends on future events that may occur between the prediction time and the actual outcome.  When the outcome is revealed, there is no more uncertainty.  It makes sense, then, to say that uncertainty will be correlated with the time remaining until the outcome is revealed.  Therefore, as long as the future random events are reasonably unlikely to occur, or their effects will not be too significant, the prediction market may still provide a useful distribution of outcome predictions.

We see this in a wide variety of prediction markets: as the market gets closer to the actual outcome becoming known, there are fewer random (unknown) events that might have a significant effect on the outcome.  Uncertainty shrinks the closer the market gets to the outcome revelation.

In the Britain’s Got Talent markets, we can see another problem with prediction markets – consistency.  Hubdub had Susan Boyle at 78% and Intrade had her at only 49%.  How could they differ by 29 percentage points in the frontrunner likelihood?  Which one is more correct (less wrong)?  Is either of them “accurate”?  These are questions for another paper.

Implications

Prof. Ipeirotis is correct to require prediction markets to be efficient (he’s not convinced they are).  I may be correct to require information accuracy and completeness (at least a sufficient amount) to be contained among the participants.  These are the essential pre-conditions for possibly using prediction markets to accurately predict future outcomes.  Finally, Chris F. Masse is correct to require the prediction markets to have a high degree of confidence in predicting the actual outcome.  This pretty much precludes discrete outcomes from the public prediction market arena (except for “entertainment” or gambling purposes).

Prediction markets should only be used where they are efficient, participants have (collectively) reasonably complete, accurate information and the degree of randomness that is unknown is within an acceptable level at the time the decision is made.  The prediction market must be able to accomplish this feat sufficiently far in advance that the decision-maker is able to formulate an appropriate response to the predicted outcome.  Finally, the prediction market forecast must be more accurate than the alternative forecasting methods (subject to cost-benefit analysis).

Posted by: Paul Hewitt | June 11, 2009

A Lesson in Prediction Markets from the Game of Craps

This is a background paper on several of the important concepts in prediction markets that may be learned from the game of craps (and other dice games).  While the prediction markets discussed are ridiculous (given that the outcome is completely unpredictable), I believe the concepts are well demonstrated.  The concepts in this paper will have direct relevance to my next paper that concerns prediction market accuracy (actually, failure) in public prediction markets.

The betting game of craps is played by rolling a pair of dice.  Each roll is a random event, with known probabilities.  No one knows which number will come up on the next roll, but everyone knows the distribution of outcomes over a larger number of rolls.  Sound familiar?  It looks like this:

Distribution of Dice Rolls


A Perfectly Calibrated, Accurate Prediction Market (that is Useless for Decision-making)

Now, let’s suspend logic for a minute and run a hypothetical prediction market to predict the outcome of one roll of the dice.  We would expect the distribution of bets (or trades) to form a distribution that is very well calibrated with the actual distribution of outcomes.  For now, let’s assume that it is perfectly calibrated.  At closing, the prediction market indicates that ‘7’ is the most likely outcome, which should occur once in every six rolls on average.  As a decision-maker, relying on the prediction market forecast, you would choose the market prediction as your best guess about the outcome.  When the dice are rolled, two sixes come up, making the total ‘12’.
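The distribution that such a perfectly calibrated market would have to reproduce can be enumerated directly. This is a small sketch, assuming two fair six-sided dice; the variable names are my own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {total: Fraction(count, 36) for total, count in sorted(totals.items())}

most_likely = max(distribution, key=distribution.get)

for total, probability in distribution.items():
    print(f"{total:>2}: {probability}")
print(f"Most likely total: {most_likely} (probability {distribution[most_likely]})")
```

Even the favourite, ‘7’, loses five times out of six, while the ‘12’ that actually came up is a 1-in-36 outcome.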

Chris F. Masse (Midas Oracle) is unhappy (maybe not), because the prediction market failed to predict the actual outcome.  Jed Christiansen is somewhat happy, because the market is perfectly calibrated.  He might even be ecstatic if the prediction market involved as few as 12 participants.  (I’m joking here, sorry Jed.)  Professor Ipeirotis says, “What did you expect?”

While the prediction market is perfectly efficient and perfectly accurate (calibrated), it is also perfectly useless for the purpose of predicting a future discrete outcome.  All relevant information is contained among the market participants (information completeness).  The market’s failure to accurately predict the next roll is caused by the randomness of the outcome.  The market is, however, perfectly useful for the purpose of betting (i.e. craps), as the odds are calibrated, perfectly, with the outcomes, making it a fair game.  As in the game of craps, we are dealing with random outcomes, which by definition are unpredictable.

Now, let’s drop one die and continue to explore the properties of these prediction markets…

Calibration Loss Caused by Information Incompleteness

Distribution of Single Die Rolls


To the right is the distribution of all potential outcomes for rolls of a single die.  If we were to run another hypothetical prediction market on the outcome of a single die roll, the distribution of trades should perfectly match this distribution, assuming a fair die is used and all participants know this.  That is, the market has perfect, complete information about the die roll.  The resulting distribution is perfectly calibrated with the distribution of actual outcomes over a large number of trials.

Now, let’s add a twist.

All participants are told that the die being used is “loaded”, such that it will turn up one number more often than any of the others.  Everyone has accurate information, but it is incomplete:  no one knows which number is more likely to be rolled.  What will the prediction market distribution look like?  It should be identical to the first case, because the participants would be expected to evenly spread their trades across all possible outcomes.  In this case, we have accurate, but incomplete information, and the resulting market distribution will not be well-calibrated with the distribution of actual rolls.  The market can be said to be efficient, accurately reflecting the information held by the participants, but it is not an “accurate” prediction market, because it is not well-calibrated.
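To see why the trades should spread evenly, here is a rough sketch. Note the assumption: the paragraph above does not say how heavily the die is loaded, so I have assumed the loaded face is twice as likely as each of the others, and I have picked ‘4’ as the loaded face purely for illustration. The helper `loaded_distribution` is hypothetical.

```python
from fractions import Fraction

FACES = range(1, 7)

def loaded_distribution(loaded_face, weight=2):
    """Die whose `loaded_face` is `weight` times as likely as each other face.

    The weight of 2 is an assumption made only for this illustration.
    """
    total = weight + 5
    return {face: Fraction(weight if face == loaded_face else 1, total) for face in FACES}

# Traders know the die is loaded but not which face is favoured, so each face
# is equally likely to be the loaded one. Average over all six possibilities.
market_belief = {
    face: sum(loaded_distribution(candidate)[face] for candidate in FACES) / 6
    for face in FACES
}

# Suppose that, unknown to the traders, the loaded face is 4.
actual = loaded_distribution(4)

print("Market belief:", {face: str(p) for face, p in market_belief.items()})
print("Actual rolls: ", {face: str(p) for face, p in actual.items()})
```

Under these assumptions the market settles on 1/6 for every face, which is efficient given what the traders know, yet it does not match the die that is actually being rolled.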

Calibration Restored, with Completeness Overcoming Information Inaccuracy

Distribution of Loaded Die Rolls


Let’s try another variation.  One of the traders is told that the die is loaded to turn up the number ‘4’ twice as often as any of the other numbers.  All other traders are kept in the dark (i.e. their information is inaccurate and incomplete).  Assuming the knowledgeable trader has sufficient wealth to move the market (i.e. the market is “efficient”), the prediction market distribution should be accurately calibrated with the actual outcomes.  In a perfectly efficient market, we would expect to see the following distribution.

Even though almost all participants had inaccurate information, the market does contain all of the information necessary to determine the true distribution.  If the prediction market did reveal this distribution, we could say that it operates as an efficient mechanism for revealing the true, complete information held within the group.
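The numbers behind this case can be worked out in a few lines. This is a sketch under two assumptions of mine: shares pay $1 if their face comes up, and the uninformed traders have priced every face at 1/6.

```python
from fractions import Fraction

# Face 4 is twice as likely as each of the other five faces: weights 1,1,1,2,1,1.
weights = {face: (2 if face == 4 else 1) for face in range(1, 7)}
total_weight = sum(weights.values())
true_distribution = {face: Fraction(w, total_weight) for face, w in weights.items()}

# Assumption: uninformed traders spread their trades evenly, pricing each face at 1/6.
uninformed_price = Fraction(1, 6)

# Expected profit per $1-payout share of '4' for the informed trader at that price.
edge_on_four = true_distribution[4] - uninformed_price

print("True distribution:", {face: str(p) for face, p in true_distribution.items()})
print("Informed trader's edge on '4' at a price of 1/6:", edge_on_four)
```

The positive edge is what gives the knowledgeable trader a reason to keep buying ‘4’ until its price reaches 2/7, which is how one trader’s information ends up in the market distribution.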

Lessons Learned

  1. Markets must be efficient to accurately reflect the information available within the market.
  2. Not all market participants need to have accurate or complete information, so long as the market is efficient and the market, collectively, holds complete information.
  3. These conditions are necessary for a prediction market to provide a distribution that is well-calibrated with that of the actual outcomes.
  4. Even a perfectly calibrated distribution, based on perfect, complete information may not be useful for predicting an outcome.  This is particularly true when dealing with discrete outcomes.
  5. A prediction market is even less likely to be useful, when there is a significant randomness inherent in the process of generating an actual outcome.
Posted by: Paul Hewitt | June 4, 2009

Isn’t it the truth

“Don’t worry about people stealing your ideas.  If your ideas are any good, you’ll have to ram them down people’s throats.”

Howard Aiken

Posted by: Paul Hewitt | May 26, 2009

The Forgotten Principle Behind Prediction Markets

Background

In his book, “The Wisdom of Crowds”, James Surowiecki gave an interesting and insightful account of the conditions for, and methodology of, predicting future outcomes.  To summarize, markets are able to predict outcomes where there is sufficient diversity, independence and decentralization of the market participants.  It is explained that to the extent these conditions hold true, the market will provide accurate predictions.  It works, because the law of large numbers ensures that uncorrelated errors “cancel out”, leaving behind “pure information”, as reflected in market prices.  Not only does it make a lot of sense, intuitively, there is ample support from economic theory.
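The “cancelling out” claim is easy to illustrate with a toy simulation. Everything here is hypothetical: the true value, the size of each trader’s error and the crowd size are numbers I made up, and the errors are assumed to be unbiased and independent of one another.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_VALUE = 100.0   # hypothetical quantity being estimated
NOISE_SD = 20.0      # hypothetical spread of each trader's independent error
CROWD_SIZE = 10_000  # hypothetical number of traders

# Each trader's estimate is the truth plus an independent, unbiased error.
estimates = [TRUE_VALUE + rng.gauss(0, NOISE_SD) for _ in range(CROWD_SIZE)]

individual_errors = [abs(estimate - TRUE_VALUE) for estimate in estimates]
typical_individual_error = statistics.mean(individual_errors)
crowd_error = abs(statistics.mean(estimates) - TRUE_VALUE)

print(f"Typical individual error:   {typical_individual_error:.2f}")
print(f"Error of the crowd average: {crowd_error:.2f}")
```

The crowd average lands far closer to the truth than a typical trader does, but only because the errors were built to be uncorrelated. Give every trader the same bias and it would survive the averaging.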

Economic support for the efficacy of prediction markets ultimately derives from Adam Smith’s “invisible hand”, Hayek’s “The Use of Knowledge in Society”, and Eugene Fama’s Efficient Market Hypothesis.  Taken as a whole, they support the position that market prices fully reflect all available information about the product or asset under consideration.  A Prediction market uses this concept to make the same assertion about a future event, condition or action, to produce a “best estimate” of the uncertain outcome.

The Key to Market Efficiency

Kenneth J. Arrow and Gerard Debreu proved that free markets are able to optimally allocate resources, under certain circumstances.  One of the key assumptions behind their general equilibrium theory is that all market participants possess complete information.  Every trader in the market knows the price that each participant is willing to pay or receive for each good.  Surowiecki described Vernon L. Smith’s classroom laboratory experiments that were designed to test the economic efficiency of markets.  While these “markets” were highly simplified, they were able to show that markets can allocate resources efficiently, even when every participant does not have “complete” information.  However, Surowiecki neglected to mention why the experiments continued to work when the information completeness assumption was not met.

They worked, because the market participants, collectively, possessed “complete” information, even though none did, individually.  The market mechanism served to induce each participant to reveal his or her information in the marketplace, ultimately revealing the “complete” set of information through the supply and demand functions and the market clearing price.  While these markets were very simple, with a single product and a relatively small number of participants, they did reveal the power of markets to assemble the information necessary to perform complex, efficient resource allocations.  They operated as if all participants were privy to all of the information.  They worked because all of the information was available within the group.

Out of the Classroom… and Into the Real World

In the real world, with significantly more complex markets, products and human relationships, the ability of markets to perform similar feats of information revelation is heavily dependent on the collective information held by the market participants.  To the extent that the participants’ information (taken as a whole) is incomplete, this will be reflected in uncertainty, or price dispersion, in the market.  Where there are significant pieces of information unknown to any of the market participants, the markets are highly unlikely to provide accurate prices or tight dispersions.  In short, the market will not be able to create information that is not already known to the participants. Which leads us to…

The Forgotten Principle Behind Prediction Markets

Prediction markets must have sufficient information completeness to accurately predict outcomes with a reasonable degree of certainty.

Each of Surowiecki’s prediction market conditions (diversity, independence and decentralization) relates to this overarching principle and serves to improve the pool of information held by the market participants, but it is not enough to simply have them present; they must operate to amass a reasonable level of information “completeness”, too.  Of course it is difficult to know, in advance, whether a pool of market participants holds enough information to be considered “complete”.  Hence, we tend to rely on Surowiecki’s three conditions, but we have forgotten that they are really a collective proxy for information completeness.

Very simple markets, with few variables influencing the future outcome, are likely to provide accurate predictions from small groups of traders, because most of the information necessary to make the prediction is known within the pool.  The more complex the factors affecting an outcome become, the greater the information required to be known by the pool of traders.  One can easily imagine exponential growth in the information requirement, as the outcome becomes subject to additional causal factors.

Most (if not all) researchers and academics seem to have lost sight of this information completeness principle.  We have seen a significant volume of work directed at solving the problem of market liquidity through the use of automated market maker mechanisms.  These solutions only became necessary, because it was difficult to gather together a sufficient number of participants and provide adequate incentives for them to keep trading and revealing their private information.  Some markets were simply too thinly traded to be “efficient”, without the use of an automated market maker.  Of course, it was cheaper to operate a market with fewer participants, too.

While this solved the liquidity problem of some prediction markets, it created other problems.  With fewer traders, there was often less diversity, independence and decentralization.  All of these factors, combined with the fewer traders meant that information completeness was bound to suffer in all but the simplest of markets.  Also, with too few traders, the process of cancelling out the uncorrelated errors of the traders breaks down.  While such markets may appear to operate efficiently, this may not be so, and worse, we will not know whether there was sufficient information completeness within the market.  Consequently, it may not be appropriate to rely on the predictions of such markets.  Their predictions will be unreliable, inconsistent and subject to too much uncertainty.

My point, here, is that you can’t “fake” an efficient market and hope to achieve the level of accuracy and certainty that a truly efficient market might provide.  An automated market maker may be acceptable, but not when it is used in place of a sufficient number of diverse, independent, decentralized traders.  There is no way to replace or create the information that is not brought to the market by the traders themselves.  However, an automated market maker mechanism may be acceptable when there is an insufficient number of active traders from time to time.

Where to Now?

If a reasonable degree of information completeness is a necessary precondition for prediction market accuracy, how will we know if it has been satisfied?  As stated above, we don’t really know whether the condition is satisfied for most prediction markets.  We have to rely on optimizing the quantity of traders, while maximizing their diversity, independence and decentralization, under a cost constraint.  To a large extent, this requires trial and error in the field.  Market specifications for one market may not work as well with other markets.  It will be necessary to increase the number of real markets and learn what works, what doesn’t, and why.

Over time, it may be possible to identify trader pools that are particularly strong in predicting certain types of outcomes, because of their combined knowledge, diversity, etc…  Over time, we may be able to identify certain types of outcomes that may be predicted with a reasonable degree of certainty (and others that are not so predictable – e.g. earthquakes).

With a greater number of real world prediction markets, we will learn more about the factors that enhance their calibration to actual outcomes.  Right now, there are too few examples to say anything about individual market calibration levels.  More trials will provide valuable insight into the factors that generate consistency in specific market predictions.  So far, the published trials have not shown any reasonable level of consistency.

Is there an effective method of pre-screening traders that will help ensure that the total pool of information is maximized for a particular market (or class of markets)?  This might be quite costly for an individual market, but if the costs can be spread over a class of markets, run multiple times, it may be worth the effort.

If a greater degree of information completeness helps reduce uncertainty, and it should, the resulting distribution of predictions will tell us whether the outcome is predictable with a reasonable degree of certainty.  If a particular class of markets is unable to reduce uncertainty to an acceptable level, we can stop using it for predictive purposes.  We may still be able to use it as a measure of uncertainty for risk management, however.

Posted by: Paul Hewitt | May 26, 2009

Calibration = Prediction Market Accuracy?

In response to a recent paper I wrote on prediction market accuracy (or lack thereof), I received counter arguments claiming that a prediction market may still be “accurate” even though the prediction fails to accurately predict the actual outcome.  The argument was that “accuracy” is found in the calibration of the market predictions to the actual outcomes.  Let’s look at this concept as it applies to two types of markets:  a pari-mutuel horse race and a “winner-take-all” prediction market (using sales as an example).

Pari-mutuel Calibration

Pari-mutuel horse race markets (betting pools) are very well-calibrated (lots of examples and lots of proof of this).  That is, the odds generated from bets placed do, in fact, reflect the actual distribution of outcomes averaged over a large number of trials (races).  We shouldn’t find this particularly remarkable, as long as the market (bettor pools) possesses a reasonable degree of information “completeness”.  This probably holds true, given the fairly large number of diverse track bettors for most races.  Consequently, horses with a 10% chance of winning, based on the bets placed, will win about 10% of the races.  Here, the results are averaged over many, many races, just as they are when a coin is tossed many times and “heads” comes up 50% of the time.  Having a high degree of calibration in these markets ensures that the odds are “fair” to the bettors.
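Here is a rough sketch of what a calibration check looks like in practice. The data are simulated, not real race results: I assume 8-horse fields, 20,000 races, and a pool that is perfectly calibrated by construction, so the point is only to show the mechanics of the comparison. The function `calibration_table` is my own.

```python
import random
from collections import defaultdict

def calibration_table(records, n_buckets=10):
    """Bucket (implied_probability, won) records and return the win rate per bucket."""
    counts = defaultdict(lambda: [0, 0])  # bucket index -> [wins, runners]
    for probability, won in records:
        index = min(int(probability * n_buckets), n_buckets - 1)
        counts[index][0] += int(won)
        counts[index][1] += 1
    return {
        (index / n_buckets, (index + 1) / n_buckets): wins / runners
        for index, (wins, runners) in sorted(counts.items())
    }

rng = random.Random(7)  # fixed seed so the run is repeatable
records = []
for _ in range(20_000):  # hypothetical number of races
    weights = [rng.random() for _ in range(8)]  # hypothetical 8-horse field
    total = sum(weights)
    implied = [w / total for w in weights]
    # Assumption: the pool is perfectly calibrated, so the winner is drawn
    # with exactly the implied probabilities.
    winner = rng.choices(range(8), weights=implied)[0]
    records.extend((p, horse == winner) for horse, p in enumerate(implied))

table = calibration_table(records)
for (low, high), win_rate in table.items():
    print(f"implied {low:.0%} to {high:.0%}: won {win_rate:.1%} of the time")
```

With real data, `records` would hold the odds-implied probability and the result for every horse in every race. The check only means something over many races, and buckets with few runners will be noisy.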

If we want to cash the most winning tickets, we would place bets on the favourite in every race.  The favourite has the highest likelihood of winning.  This doesn’t mean that the favourite will win, just that the odds of that horse winning are better than those for any of the other horses.  If there was no track “take” (i.e. no cost to play), it would be a zero-sum game.  You could bet on any (or all) horses and expect to come out “even” in the long-run.  Not much fun in that!

While pari-mutuel horse race markets are set up for the primary purpose of wagering, they do provide a frequency distribution of bets placed on each of the horses, which provides some predictive information about the future outcome (winner).  However, pari-mutuel horse races are different from “winner-take-all” prediction markets that attempt to predict the actual future value of a continuous variable (future sales for example).  In a horse race, the possible outcomes are discrete (horses).  In the horse race, unless the horse with the highest likelihood of winning does win, the market has failed to predict accurately, despite the fact that the pari-mutuel market is “well-calibrated.”  This is fine for a betting pool, but it is of little use in a corporate prediction market.

Enterprise Prediction Market Calibration

Many enterprise prediction markets are formulated as “winner-take-all” bets, which provide distributions of predictions about uncertain outcomes, somewhat similar to those of horse race markets.  Ideally, we would like these distributions to accurately reflect the distribution of actual outcomes that are being predicted.  Seems obvious enough, but how do we know when a prediction market is well-calibrated with the distribution of actual outcomes?  I have yet to see a study that has run similar enterprise prediction markets enough times to obtain an accurate distribution of actual outcomes that could be compared with prediction market distributions.  Aren’t we really assuming that prediction markets are well-calibrated?

In a prediction market, the decision-maker is hoping to derive an accurate prediction of the actual outcome.  Given that we are looking at an uncertain outcome, there will always be an error factor associated with the prediction.  If the most likely state (or share) does not capture the actual outcome, we hope that the next most likely state will.  That is, we want the most likely state to be as close as possible to the actual outcome.  Contrast this with a horse race.  There is no decision-maker other than the bettor.  The bettor selects a horse to win.  If the horse does not win, it doesn’t matter which of the other horses actually won, the bettor loses.  In an enterprise prediction market, the decision-maker does care which of the other “horses” (states) “wins.”

The difference is that an enterprise prediction market usually attempts to predict a continuous variable, such as quarterly sales, whereas a horse race market attempts to predict a discrete outcome.  In such a prediction market, we can derive the average sales forecast figure.  This is the figure representing the best estimate of the future outcome.  This is the figure that must be “accurate” for it to be useful in decision-making.  In a horse race market, such an average is meaningless (i.e. the 2.6th horse?), because the horse numbers (or names, or positions, etc…) are not related in any meaningful way.
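As a sketch of how that average might be derived, the snippet below weights the midpoint of each sales range by its closing price. The ranges and prices are hypothetical numbers of mine, and using the midpoint of each range is a simplifying assumption.

```python
# Hypothetical market: each state is a quarterly sales range ($M) and its closing price.
states = [
    ((14.0, 14.5), 0.05),
    ((14.5, 15.0), 0.20),
    ((15.0, 15.5), 0.50),
    ((15.5, 16.0), 0.20),
    ((16.0, 16.5), 0.05),
]

total_price = sum(price for _, price in states)

# Treat the normalised prices as probabilities and weight each range's midpoint.
mean_forecast = sum(((low + high) / 2) * price for (low, high), price in states) / total_price

print(f"Mean sales forecast: ${mean_forecast:.3f}M")
```

The same calculation on horse numbers would run without complaint and produce nonsense, which is the point of the “2.6th horse”.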

The value of calibration is that it verifies the extent to which the distribution of predictions (bets) matches the actual distribution of outcomes.  This tells us the extent to which we may rely on the prediction market distribution as a proxy for the underlying uncertainty of the actual outcome.  It also tells us (if well-calibrated), how much uncertainty exists surrounding the future outcome.  If there is a great deal of uncertainty, the prediction market will not be very useful.

Now let’s look at a prediction market that has near perfect calibration, but a nearly flat distribution of bets (opinions).  Some might argue that the market is “accurate”, but it is useless for decision-making purposes.  The market is telling us that the outcome is too unpredictable.  However, the market could be used for betting, because it is well-calibrated. Think of a betting market on the outcome of rolling a fair die.

Conclusion

The point of this discussion is that prediction markets should be well-calibrated, but this is not a sufficient condition for their usefulness.  They must also provide accurate predictions, with relatively tight distributions.  The maximum allowable dispersion of the distribution will depend on the materiality of the forecast error.  That is, the prediction should be accurate enough, such that the maximum allowable error would not cause the decision-maker to alter his or her decision had the true value been known in advance.

Where the distribution of the prediction market is not tight, the market may still have some use, but not so much for being able to predict the outcome.  Instead, the market will be providing information about the degree of uncertainty surrounding the outcome.  This may indicate the need for greater care in assessing risks and the need for more extensive contingency planning.  A flatter distribution may indicate that the market is not functioning properly (lack of information completeness, perhaps).  Alternatively, a flat distribution may indicate that the variable being predicted is, simply, not predictable.

Posted by: Paul Hewitt | May 5, 2009

The Future of Prediction Markets – Part I

We would like to be able to run a prediction market to predict the future adoption of prediction markets (public or private), but we can’t.  There is no way to verify the outcome to determine which option would pay off.

Based on my research to date, this is where I think prediction markets are heading and where I think they should be heading.  In this paper, I will focus on Enterprise Prediction Markets.  A subsequent paper will cover Public Prediction Markets.

Private (Enterprise) Prediction Markets

In my view, these provide the most promise for future adoption, despite the almost insurmountable problems they have gaining acceptance in the corporate setting.  I am optimistic, however, because I believe prediction markets do have the potential to be better predictors of the future than other forecasting methods, at a lower cost.

My review of the literature and case studies (that have been published) indicates that prediction markets have improved the accuracy of forecasts, but the improvements have not been great enough to encourage widespread (or even minimal) acceptance.  Furthermore, these studies like to average their results over a number of markets, disguising the fact that some markets improve forecasts, while others fail to do so.  Some studies look at average absolute errors, covering up the fact that some predictions were underestimating the true outcome and others overestimating it.  This means the real errors are as much as twice as large as those reported.  Few, if any, explanations for the failures are ever presented. This raises the issue of consistency.  In case studies such as these, where there is no clear under- or over-estimation tendency, for which a correction may be made, the prediction errors are just too great.
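A small sketch shows what an averaged error figure does and does not reveal. The six error values are hypothetical, chosen by me to mix under- and over-estimates; they are not taken from any study.

```python
# Hypothetical forecast errors (forecast minus actual, in percent) for six similar markets.
signed_errors = [-6.0, 5.0, -4.0, 6.0, -5.0, 4.0]

mean_signed_error = sum(signed_errors) / len(signed_errors)
mean_absolute_error = sum(abs(e) for e in signed_errors) / len(signed_errors)
spread = max(signed_errors) - min(signed_errors)

print(f"Mean signed error:   {mean_signed_error:+.1f}%")
print(f"Mean absolute error: {mean_absolute_error:.1f}%")
print(f"Spread between worst under- and over-estimate: {spread:.1f} points")
```

In this made-up example neither average tells the decision-maker which way the next forecast is likely to miss, and the distance between the worst under-estimate and the worst over-estimate is much wider than the single averaged figure suggests.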

Clearly, if similar prediction markets do not provide consistently accurate forecasts, they will not be relied upon for any important business decisions.

Businesses make estimates and forecasts in virtually everything they do.  Every decision model accepts inputs which are estimates, predictions or forecasts of likely scenarios for future conditions, events and actions.  Decisions made are only as good as the model used and the accuracy of the data being used.  “Garbage-in, Garbage-out” doesn’t just apply to computers.  There is a clear profit incentive for companies to improve their decision-making, by improving the quality of the data relied upon.  Traditional forecasting models have a spotty track record for accuracy.  Prediction markets may be a good alternative to, or add value to, traditional forecasting methods.

To be useful in the corporate world, prediction markets must provide forecasts that are more accurate than traditional methods, or be a cheaper alternative of providing equivalent forecasts.  Only if this pre-condition has been met, can we look at the other potential benefits.  It makes no sense to talk about how quickly or cheaply a prediction market gives a forecast, if the forecast is wrong!  Therefore, the focus must be on accuracy.  Let’s get it right, first.  Then, we can make it better or more efficient.

Once the accuracy and consistency issues have been met, prediction markets can be relied upon to provide a measure of the uncertainty surrounding the forecasts.  It does this with an objective distribution of “votes” around the mean prediction.  It is a particularly useful measure, with applications in risk management and contingency planning.

Assessment of Enterprise Prediction Markets (EPMs) to date:

  1. they have some ability to improve the accuracy of forecasts in specific situations;
  2. they have an ability to reduce (and measure) the uncertainty of the forecast;
  3. they perform a relatively fast aggregation of traders’ predictions, and
  4. they are a relatively cheap forecasting method.

EPM Deficiencies:

  1. We don’t know why they don’t work in some cases (even with similar markets);
  2. Most forecasts are not significantly better than traditional methods (yet);
  3. They lack consistency.

Future Research (just a few):

  1. Prediction markets require a crowd of people, with as much diversity as possible, holding privately-generated independent information.  Future research must focus on how to achieve these characteristics.  Too often the research has focused on how to get around the need for a “crowd”, seemingly forgetting that reducing participation will also reduce diversity and completeness of the information contained in the crowd.  Mistake.
  2. We need to know the determinants of accuracy and consistency.  Find out what makes some markets work well, while others fail.  Find out why there is a lack of consistency in the predictions obtained from similar markets.  Then correct for these deficiencies.
  3. Find out which types of issues are best suited for prediction markets, and discard those that will never provide accurate, consistent predictions.
  4. Find out what makes a good “crowd”.
  5. Find out how to get a good crowd and keep them motivated to reveal their private information.

Of course there are many other issues related to EPMs, but I believe these are the crucial, must-solve ones.  Without accuracy and consistency, EPMs will be nothing more than a novelty.

Posted by: Paul Hewitt | May 3, 2009

Prediction Market Accuracy and Usefulness

Consensus and Differences of Opinion in Electronic Prediction Markets Thomas S. Gruca, Joyce E. Berg and Michael Cipriano (2005)

I came across an obscure paper that delivers some interesting findings about the capabilities of prediction markets in the real world. Google Scholar indicates that this paper has only six citations, yet I found it to be very useful, because it involves a real world case study that examines three aspects of prediction markets:

  1. How well do prediction markets capture private information held by traders?
  2. Do prediction market prices reflect the dispersion of trader forecasts in addition to the consensus?
  3. How does the composition of the trader pool affect the disclosure of private information?

The authors conclude that prediction markets are able to aggregate privately held information quite well, they are able to aggregate information about the consensus of private information and its dispersion, and that ‘open’ markets result in better predictions than ‘closed’ markets of homogeneous traders.  Consequently, corporate prediction markets should not be restricted to in-house participants.  In this blog, I critically examine these conclusions and provide additional insight into the issues raised.

Background

The authors start with the premise reached by Plott and Sunder (1982, 1988), who were able to show that markets are able to disseminate information from “informed” traders to the uninformed traders.  Where there is perfect information (no uncertainty), it is effectively communicated from the informed to the uninformed.  Where the information is “complete” (sum of all information reveals the true state), market prices accurately predict the outcome.  Where there is uncertainty or the information set is not complete, prices may deviate from their expected values and lose the power to predict accurately (Sunder 1995). Their conclusions were based on laboratory experiments, involving a simple, hypothetical situation.

The authors of the current paper decided to test these conclusions in the real world.   They chose to run a series of markets, similar to those run by the Hollywood Stock Exchange (HSE), involving predictions of four-week box office receipts for 11 different movie openings (November 1998 – November 2002).  Each market involved 4 – 6 “winner-take-all” securities.   Trading took place on the Iowa Electronic Market (IEM), using its continuous double-auction mechanism with real money trades.  Trading commenced between four and 14 days before each movie opened in the theatres.  A market prediction was obtained immediately before each movie opened, though trading continued during the movie’s run.

In order to test the market’s ability to aggregate private information held by traders, the authors collected forecasts from traders before they started trading.   This provided a measure of the private information held by the traders (as opposed to public information revealed by prices or other means).   Most of the traders were marketing students who completed a project in which they were asked to forecast movie box office receipts, performing their own analyses, using any information they could find.  There were four “closed” markets, in which all of the traders were students who had submitted their private forecasts before trading.  There were also seven “open” markets in which other self-selecting traders were allowed to participate, using their own money.   Here, the term “forecasts” refers to the students’ prior forecasts, and “predictions” refers to the prediction markets’ predictions.  This will make it easier to follow the analyses.

Do Prediction Markets accurately incorporate Private Information?

Yes. The authors compared the means of the students’ forecasts before trading in the market with the mean prediction implied by the market prices just before the movie opened.  They found a correlation of 0.99, indicating that the prediction market prices were accurately reflecting the private information held by the traders.

Do Prediction Markets reflect the Dispersion of Traders’ Forecasts (based on private information)?

The traders’ private information was incorporated and reflected in their forecasts (made prior to trading).  The degree of dispersion of these forecasts is described by the standard deviation.  Similarly, the authors calculated the standard deviation implied by the contract prices obtained from the prediction market.   They found that the market standard deviation was smaller than that for the students’ forecasts in every market, indicating a tighter distribution in the prediction markets and, presumably, a less uncertain prediction.  Some of the reasons put forth to explain the tighter distribution were that:

  • extreme forecasts get changed by some traders, when they see the other traders’ forecasts, as reflected in market prices;
  • the number of contracts in the market may have affected the standard deviation, and
  • the assumption of a normal distribution may affect the true standard deviation.

So, they compared the actual market contract prices with those that would be expected if the entire distribution of students’ point forecasts (private, prior) were used to determine the contract prices.  That is, using the frequency data from the point forecasts, they estimated the probability of each contract paying off.  The expected contract prices should correspond to those observed in the market, if the entire distribution of students’ private information is being reflected in the contract prices.  Here, they found that the correlations were significant in 7 of the 11 markets, with the average being 0.81.  However, the correlations were particularly poor in two markets.  They cite three possible reasons for the poor correlations:

  • Additional information was obtained by traders after their point forecasts were made (and reflected in market prices only);
  • Other, non-student, traders (no prior forecast) were more influential in setting market prices than were the student traders (these markets appear to have been dominated by non-student traders, who had very different information), or
  • There was a market failure.

No conclusion was reached.  We might say that if either of the first two explanations is true, that is a good thing.  We want prediction markets to incorporate new information and the information provided by new participants.   Also, we want the market to determine which traders will be most influential in setting prices, based on their own individual predictions and degrees of certainty.  That is, just because the students did some research doesn’t mean that their forecasts should dominate in the prediction market.  They may not be very good forecasters.
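The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the forecasts, contract ranges, range midpoints and market prices are all made up, and the spread is computed as a plain discrete standard deviation over the range midpoints, which is not necessarily the calculation the authors used.

```python
def expected_prices(forecasts, upper_bounds):
    """Share of point forecasts landing in each contract's range.

    upper_bounds holds the upper edge of every range except the last,
    which is treated as open-ended.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for f in forecasts:
        # Index of the range containing f: number of edges at or below f.
        idx = sum(1 for b in upper_bounds if f >= b)
        counts[idx] += 1
    return [c / len(forecasts) for c in counts]


def implied_mean_sd(prices, midpoints):
    """Mean and standard deviation implied by winner-take-all prices."""
    total = sum(prices)
    probs = [p / total for p in prices]  # normalise in case prices do not sum to 1
    mu = sum(p * m for p, m in zip(probs, midpoints))
    var = sum(p * (m - mu) ** 2 for p, m in zip(probs, midpoints))
    return mu, var ** 0.5


# Hypothetical data: student point forecasts in $ millions, four contracts.
forecasts = [42, 55, 58, 61, 63, 67, 72, 74, 88, 95]
upper_bounds = [50, 70, 90]            # ranges: <50, 50-70, 70-90, >=90
midpoints = [40, 60, 80, 100]          # assumed representative value per range
market_prices = [0.05, 0.60, 0.30, 0.05]

forecast_prices = expected_prices(forecasts, upper_bounds)
forecast_mu, forecast_sd = implied_mean_sd(forecast_prices, midpoints)
market_mu, market_sd = implied_mean_sd(market_prices, midpoints)

print("expected prices from forecasts:", forecast_prices)
print("forecast-implied mean, sd:", forecast_mu, forecast_sd)
print("market-implied mean, sd:", market_mu, market_sd)
```

With these made-up numbers the market-implied standard deviation comes out smaller than the forecast-implied one, which is the pattern the authors reported. The expected prices could then be correlated against the observed market prices, contract by contract.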

Does the Composition of Traders Affect Market Accuracy?

There were two classes of markets – ‘open’ and ‘closed’.  The closed markets included only students who had completed the project of forecasting movie receipts before they began trading.  Open markets included other real money traders, who self-selected into the markets.

In order to estimate the accuracy of the prediction markets, the authors looked at the absolute percentage error of the predictions and forecasts (private, priors).   They found a mean absolute percentage error (MAPE) of 0.29, or 29% across all markets.  The MAPE for the seven open markets was 17%, but for the four closed markets it was 50%.   The authors conclude that adding traders to the mix improves the accuracy of the prediction markets.   They imply that corporate prediction markets should consider opening the markets to traders not normally involved with the forecast, in order to improve the accuracy of the predictions.

There are several problems with this analysis. The authors’ conclusion is wrong.   Looking at all of the students’ forecasts, we find that the MAPE was 33%.  We also find that it was 57% when they were in ‘closed’ markets, but only 20% when they were in ‘open’ markets.  The students did not know which market they would be in prior to making their forecasts, so it should be irrelevant.  We need to look, solely, at the overall accuracy.

By applying a bit of my own math, I find that the percentage improvement of the market predictions over the initial student forecasts is about 11.7%, and it does not matter much whether the market is open or closed.  Both open and closed markets experienced gains in accuracy (11.5% and 12.0%, respectively).  However, two of the seven open markets actually had a higher error than the initial forecasts made by the students prior to the market opening.  This was not explained by the authors.  I will provide one explanation, below.  We cannot attribute any effect on accuracy to whether the market was ‘open’.  Instead, the average error appears to be more dependent on the particular movie’s receipts being forecasted.  Some movies are harder to predict than others.  Maybe these markets are not appropriate for obtaining useful predictions, given the makeup of the trader pool.
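As a quick consistency check on the figures quoted above, the sketch below recombines the open and closed MAPEs into overall figures, weighting by the number of markets. It uses only the rounded percentages given in the text. The `relative_improvement` helper is a hypothetical name for the usual "percentage improvement" formula, and applying it to these rounded inputs should not be expected to reproduce the 11.7% figure, which was worked out from the underlying per-market data.

```python
def weighted_mape(groups):
    """Combine per-group MAPEs, weighting each by its number of markets."""
    total_markets = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_markets


def relative_improvement(baseline_error, new_error):
    """Fractional reduction in error relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# (number of markets, MAPE) as quoted in the text, rounded to whole percents.
market_groups = [(7, 0.17), (4, 0.50)]    # open, closed market predictions
student_groups = [(7, 0.20), (4, 0.57)]   # open, closed student forecasts

overall_market = weighted_mape(market_groups)
overall_students = weighted_mape(student_groups)

print(f"overall market MAPE:  {overall_market:.2f}")
print(f"overall student MAPE: {overall_students:.2f}")
```

The weighted averages land on the 29% and 33% overall figures, so the open and closed numbers are at least internally consistent.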

UPON FURTHER EXAMINATION…

I took the data disclosed in this paper and ran it through my own analyses.   I segregated the open and closed market data, so that all analyses could be compared between the two groups, if necessary.   I calculated the average percentage error for the student forecasts and for the market predictions, to see how much of an improvement (if any) was obtained by running the prediction markets.   I calculated the decrease in the standard deviation between the student forecasts and the market contract prices, to see whether the prediction market helped to reduce the uncertainty of the prediction over the students’ initial forecasts.

The authors calculated the percentage error with the actual outcome in the denominator.  They also looked only at the absolute error (i.e. it didn’t matter whether the market under- or over-estimated the outcome).  If the Hollywood executives were to use the forecasts or market predictions in their decision-making, the error should be calculated using the forecast figure as the base (denominator), as this is the figure they would be using to make decisions.   I made this adjustment.   I already had the standard deviations for each market, for the students’ forecasts and for the market predictions.  Armed with this, I thought it would be interesting to see whether the prediction markets outperformed the students in their forecasts of the actual movie receipts.
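To see why the choice of denominator matters, here is a small sketch with made-up numbers (a forecast of 100 against an actual outcome of 150). The function name and its `base` parameter are illustrative only.

```python
def pct_error(forecast, actual, base="actual"):
    """Signed percentage error, measured against the chosen base figure."""
    if base not in ("actual", "forecast"):
        raise ValueError("base must be 'actual' or 'forecast'")
    denominator = actual if base == "actual" else forecast
    return (forecast - actual) / denominator


# Hypothetical movie: forecast receipts of 100, actual receipts of 150.
forecast, actual = 100.0, 150.0

error_vs_actual = pct_error(forecast, actual, base="actual")
error_vs_forecast = pct_error(forecast, actual, base="forecast")

print(f"error relative to actual outcome: {error_vs_actual:.1%}")
print(f"error relative to the forecast:   {error_vs_forecast:.1%}")
```

The same miss of 50 is a 33% shortfall when measured against the outcome, but a 50% shortfall when measured against the figure the decision-maker actually had in hand.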

Would Hollywood executives rely on these prediction markets?

The answer has to be ‘no’.

As mentioned above, the average absolute error of the market predictions was 29%, which is only an 11.7% improvement over the students’ initial forecasts.  This shows that prediction markets do bring about some improvement in forecasts of the future, but is it good enough to be used in decision-making?  The answer has to be ‘no’ in the case of predicting future movie receipts (at least with these trader pools).

Using the absolute percentage error disguises the fact that the errors go both ways (some were under- and others were over-estimated).   Further, the prediction markets provide no guidance as to which way the error is likely to fall.  Therefore, the real error is much larger than the absolute (value) of the percentage error.  It is, perhaps, as much as twice the error calculated by the authors.  Consequently, the real prediction market error might be as high as 58% in these markets.
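A rough sketch of this point, using made-up signed errors chosen so that their mean absolute value is 29%: when the direction of the error is unknown, the band a decision-maker has to plan around runs from the largest under-estimate to the largest over-estimate.

```python
# Hypothetical signed percentage errors for four markets
# (positive = over-estimate, negative = under-estimate).
signed_errors = [0.30, -0.28, 0.25, -0.33]

mean_absolute = sum(abs(e) for e in signed_errors) / len(signed_errors)
mean_signed = sum(signed_errors) / len(signed_errors)
band_width = max(signed_errors) - min(signed_errors)

print(f"mean absolute error: {mean_absolute:.1%}")
print(f"mean signed error:   {mean_signed:.1%}")
print(f"width of error band: {band_width:.1%}")
```

The signed errors nearly cancel on average, so there is no systematic bias to correct for, yet the band of plausible outcomes is roughly twice as wide as the mean absolute error.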

We also saw that the predictions in two of the markets were worse than the initial forecasts (and we don’t know why this happened).  This speaks to the consistency issue.   If prediction markets cannot provide consistently accurate predictions in similar situations, how can they be relied upon for decision-making purposes?

What went wrong?

The authors considered the information that students obtained through their research and analyses as being “private”.  Except to the extent there may have been “collusion” in the development of individual forecasts (i.e. “study groups”), the students’ conclusions were privately held.  However, students would not be privy to industry information that would be available to Hollywood executives, film distributors, theatre owners, film critics, etc.  Instead, the students only had access to publicly available information on which to base their forecasts.  So, I think it is safe to say that the information available to the traders (collectively) was not “complete.”

Since completeness is a pre-condition for market prices to predict the true outcome, it is not surprising that these markets failed to accurately predict movie receipts.  The trader pool was not diverse enough to have in their possession enough information to predict the outcome accurately.

These markets showed that prediction markets are able to reflect participant information fairly accurately, but if there isn’t enough information from the traders, the prediction may not be very good.  The conclusion has to be that diversity in the trader pool must be sufficient to include most of the relevant information needed to make an accurate prediction.

Perhaps a reduction in uncertainty has value?

In my analysis, I calculated the improvement of the dispersion in the prediction markets, relative to the initial forecasts.  Overall, the standard deviation in the prediction markets was about 35% tighter than that of the student forecasts.  It appears that trading in a prediction market helps to focus the participants’ estimates closer to the mean.   On the face of it, we would say this is a good thing.  The market is less uncertain about the forecast than a flatter distribution would indicate.  But, in these markets, the predictions have very large errors.  In a word, they were inaccurate.

Let’s examine this from a decision-making point of view.  We would expect a range, one standard deviation around the mean, to capture the actual outcome 68% of the time, if the distribution is normal.  The actual movie receipts were contained within this range for the students’ mean forecasts in 7 of 11 markets.  That is perhaps about what one might expect, given that the students were not “experts” in forecasting movie receipts.  Here’s the kicker: the actual receipts failed to fall within the corresponding range around the market prediction in 8 of the 11 prediction markets! Put another way, had the executives been making decisions based on a range of potential movie receipts within one standard deviation of the market prediction, they would have expected that range to capture the outcome 68% of the time.  This did not happen in these markets.  We aren’t even looking at whether this level of accuracy is adequate for their decision-making purposes.   (I doubt it would have been).  So, even though the prediction markets had tighter distributions, they did not appear to be usefully more accurate than the students’ forecasts.
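The counts above can be put in perspective with a simple binomial calculation. This is a sketch that assumes the 11 markets are independent and that each one-standard-deviation range really does have a 68% chance of capturing the outcome. Neither assumption is tested here.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_markets = 11
p_within = 0.68                 # nominal one-standard-deviation coverage

expected_hits = n_markets * p_within
student_hits = 7                # ranges that captured the outcome (students)
market_hits = n_markets - 8     # 8 misses reported for the market ranges

prob_market = binom_cdf(market_hits, n_markets, p_within)

print(f"expected hits out of {n_markets}: {expected_hits:.2f}")
print(f"student hit rate: {student_hits / n_markets:.0%}")
print(f"market hit rate:  {market_hits / n_markets:.0%}")
print(f"chance of {market_hits} or fewer hits at 68% coverage: {prob_market:.4f}")
```

Under these assumptions, the students' 7 hits are close to the expected number, while 3 or fewer hits would be a rare result if the market's ranges were properly calibrated.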

We find that a tighter distribution around an inaccurate forecast can make for very poor decisions.

It makes no sense to be “more sure” (or less uncertain) of a wrong forecast.

Posted by: Paul Hewitt | April 29, 2009

A Cheaper Alternative to Prediction Markets?

Recently, I came across this article in The Economist that discusses the genius and extraordinary abilities associated with autism.  It was thought provoking.

Genius locus – The Economist, April 16, 2009

In the article, it mentions some of the tasks that autistic people seem to have an uncanny ability to perform.  One of them was cited here:

“It helps them, too, with other tasks savants do famously well—proofreading, for example, and estimating the number of objects in a large group, such as a pile of match sticks.”

The public popularity of prediction markets jumped with the publishing of James Surowiecki’s book on the Wisdom of Crowds.  He introduced the topic by describing how the crowd’s wisdom was far superior to that of any individual at the county fair.  Other examples include guessing the number of jelly beans in a jar.  It seems to me that, if an autistic individual is able to estimate things like the above example, perhaps they might be just as good at estimating other things.  In effect, they are predicting the answer.  If so, maybe we don’t need a crowd, we need just a few – but autistic ones.

Next, the article discusses whether similar types of feats can be “learned” and considers London taxi drivers as a possible example.

“There are, however, examples of people who seem very neurotypical indeed achieving savant-like skills through sheer diligence.  Probably the most famous is that of London taxi drivers, who must master the Knowledge—ie, the location of 25,000 streets, and the quickest ways between them—to qualify for a licence.”

“The prodigious geographical knowledge of the average cabbie is, indeed, savant-like.  But Dr Maguire recently found that it comes at a cost.  Cabbies, on average, are worse than random control subjects and—horror—also worse than bus drivers, at memory tests such as word-pairing.  Surprisingly, that is also true of their general spatial memory. Nothing comes for nothing, it seems, and genius has its price.”

I might add another side-effect of their learning process.  There seems to be a very high correlation between back injuries and being a London taxi driver.  I’ve found this to be used as a convenient excuse for not lifting even the lightest of suitcases when picking up a passenger.

Maybe more accurate predictions are only a fare away!

Posted by: Paul Hewitt | April 25, 2009

Judging Accuracy in Prediction Markets

I’ve had a chance to review Emile Servan-Schreiber’s paper, Prediction Markets:  Trading Uncertainty for Collective Wisdom.  The paper indicates that it will be included in a forthcoming book on Collective Wisdom.  It summarizes some of the evidence in support of the accuracy of prediction markets.  I wholeheartedly agree with the author’s contention that diversity is a key determinant of prediction market accuracy.  I agree that characterizing prediction markets as being more like “betting exchanges” is appropriate, too.  However, I disagree that the established research has proven the case for the accuracy contention, as I hope to explain, below.

While the paper is a good summary of many of the key aspects of prediction markets, in arguing that prediction markets are accurate forecasting tools, the author cites the HP prediction market results (6 out of 8 performed better than the “official” forecasts) as one of the proofs.  It has been a decade since these prediction markets were run and still it is one of the most frequently cited proofs of prediction market accuracy.  However, the author denigrates this finding, somewhat, by noting that “beating official company forecasts isn’t always as hard as it sounds, because the goal of an official forecast is often more to motivate employees towards a goal than to predict outcomes.”

It appears that the author is saying that it shouldn’t be too difficult to beat an “official” forecast, because it is biased (in order to motivate).  Depending on the definition of “official forecast”, to some extent I might be able to agree with this assessment.  However, the fact that HP’s prediction markets did not beat the official forecast in every instance speaks to the contrary.  Furthermore, if it isn’t that difficult to beat an “official” forecast, why didn’t the HP markets do so by a significant margin? Prediction markets are supposed to reduce the bias inherent in other forecasting methods.  If the official forecasts are biased, we should not be comparing them with prediction market forecasts at all.  The true accuracy of prediction markets depends on their ability to accurately and consistently forecast actual outcomes. If alternative methods are not trying to predict the same thing, we shouldn’t be comparing them.  Here are my comments…

What, exactly, was the “official” forecast that was used in comparison with the prediction market forecast?

Was it an internal sales budget? Such budgets (forecasts) are routinely used to set target quotas for sales teams.  The bar is usually set a bit higher than it should be, to motivate the team to “try harder” to meet the objective and earn a bonus.  The budget cannot be too high (optimistic), otherwise it will have a de-motivating effect.  If we look at the eight HP prediction markets that had official forecasts, we find one that was almost bang-on, four that were significantly below the actual outcome and three that were above.  Of the three that might be considered “motivationally-inflated” official forecasts, two appear to be reasonable, with errors of 13% and 4%, but the third was overstated by a whopping 59%!  Three of the four understated official forecasts were significantly below the actual outcome (28% – 32%).  None of the understated official forecasts could be described as “motivational”. After all, you don’t lower the bar to motivate higher jumping.  We might have expected all of the prediction market forecasts to be 5%-10% lower than the official forecast (if it was a sales budget), but they were not.  Bottom line: Even if the official forecast was really a sales budget, it would never have been lower than the expected (most likely) sales outcome, nor should it have been too much higher.

Was it an “official” forecast provided to market analysts? Obviously, not all product sales forecasts are provided to analysts (though some are), but certainly, sales projections by product line or division would be commonly disclosed.  These figures would be derived by aggregating the sales projections of individual products or lines.  Corporate management are required to disclose all significant, relevant information (public companies).  If management were to issue inflated “official” forecasts to the market, the analysts would clobber the share price when the true sales (outcome) became known.  If management is consistently optimistic in their forecasts, analysts will discount their forecasts and take it out on the share price.  Management is unlikely to be consistently pessimistic as this would serve only to put downward pressure on their share value.  Analysts are able to spot a company consistently “jumping” over a “low bar.”  Bottom line: If the official forecast is the one that is publicly disclosed, it is likely to be close to management’s best estimate of the sales outcome.

Management needs to make a variety of decisions (production, distribution, marketing, sales, HR and finance, etc…) that depend upon the best estimate of future sales.  To make such decisions using a biased forecast would be foolish and potentially very costly.  The important (useful) forecast is the one that will help management make better decisions.  This is the forecast that management needs to predict more accurately, not a “tool” such as a sales budget.

Given that HP used the term “official” to describe the forecasts that were being compared with the prediction market forecasts, it is likely that the official forecasts were the true best-estimates of the future outcomes.  If they were, in fact, merely sales budgets, we would expect the prediction market forecasts to always be lower than the budget, and this was not the case.  Consequently, if a prediction market is able to beat the “official” forecast, consistently, it should be considered a better forecasting tool than that used to generate the official forecast.

I have already written about my objections to the HP study, where I recognized that most of the prediction market forecasts appeared to be better predictions of the official forecasts than they were of the actual outcomes.

Since I’m discussing the “official” forecasts, here, I would add that the HP prediction markets were run before the “official” forecasts and some of the participants were also involved in the setting of the “official” forecasts.  No wonder these forecasts were correlated.  The slight improvement of the prediction market forecasts over the official ones may indicate the slight effect of the small amount of additional diversity in the prediction market group over the “official” forecasting group.  It could also be explained by the internal “political” climate that influenced the official forecast, but not the prediction market forecast.  Either way, it is not a sound comparison for proving prediction market accuracy.

We still have a long way to go in proving the case for enterprise prediction market accuracy.  I believe the academics have given sufficient theoretical support, but the real proof is in the field.

Posted by: Paul Hewitt | April 12, 2009

An Analysis of HP’s Real Prediction Markets

The following article discusses the results of Hewlett-Packard’s trials with prediction markets in the late 90s.  I’m posting my comments as a review and critique of this paper.

Information Aggregation Mechanisms: Concept, Design and Implementation for a Sales Forecasting Problem Kay-Yut Chen & Charles R. Plott.

At the outset, I’d like to commend the authors for publishing their data. Even though these markets were run more than a decade ago, there have been virtually no other published results to date. Unless we are able to review actual case studies of real prediction markets, the future of the prediction market “industry” will be bleak (no prediction market is necessary to reach this conclusion). If I appear to be overly critical of some of the authors’ conclusions and methodology, I apologize. My intent is to point out areas in which prediction markets may be improved for use in a corporate setting.

Background

In this paper, the authors report on the results of HP’s internal prediction markets to forecast sales. The 12 prediction markets were run between October 1996 and May 1999. Their goal was to take prediction markets (Information Aggregation Mechanisms) out of the laboratory and into the field, to see how they work in a practical setting. Most markets attempted to forecast monthly sales of particular products, three months in advance.

To be fair, the design and implementation of these markets was constrained by management. Each market was open for one week only and for a limited time period each day. The number of active participants ranged from 12 to 24, with one that had only seven. Even the authors acknowledge that these markets could only be described as being “thin”. While the participants had access to HP databases, they did not have access to the official HP forecasts (where available).

The markets were not operated continuously up to the start of the outcome month (or even during that month). This was unfortunate, as we might have learned more about how well (or not) prediction markets incorporate new information to revise market predictions.

Most likely a function of the market thinness (and the double auction market mechanism), the sum of the market prices for each potential outcome (range) did not add up to the market payoff (as it should), and the market prices were not “stable”. This says a lot about the need for a sufficient number of participants (however many that might be). It also says that maybe we do need some form of market scoring rule or a dynamic pari-mutuel mechanism, to at least ensure that the probabilities add up to the payoff.
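To illustrate how a market scoring rule keeps the prices coherent, here is a minimal sketch of one well-known example, Hanson's logarithmic market scoring rule (LMSR). It is not what HP used. The share quantities and the liquidity parameter `b` below are made up for illustration.

```python
from math import exp, log


def lmsr_cost(quantities, b):
    """LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b))."""
    return b * log(sum(exp(q / b) for q in quantities))


def lmsr_prices(quantities, b):
    """Instantaneous prices: p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
    weights = [exp(q / b) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical market with four outcome ranges and some shares already sold.
shares_outstanding = [10.0, 25.0, 5.0, 0.0]
b = 20.0  # liquidity parameter (assumed); larger b means prices move more slowly

prices = lmsr_prices(shares_outstanding, b)
print("prices:", [round(p, 3) for p in prices])
print("sum of prices:", round(sum(prices), 6))

# Cost to a trader of buying 5 more shares of the first outcome.
after_trade = [15.0, 25.0, 5.0, 0.0]
trade_cost = lmsr_cost(after_trade, b) - lmsr_cost(shares_outstanding, b)
print("cost of buying 5 shares of outcome 1:", round(trade_cost, 3))
```

Because the prices are a normalised set of weights, they always sum to the payoff of 1, no matter how few traders take part. That is the property the thin double-auction markets lacked.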

The Results

The authors conclude that the results indicate that the HP prediction market is “a considerable improvement over the HP official forecast.” Basically, they’re saying that, because in 6 out of 8 events the prediction market error was smaller than the error of the official HP forecast, the prediction market outperforms the HP official forecast. It is true, but we need to take a closer look at the data.

In virtually every case, the prediction market forecast is closer to the official HP forecast than it is to the actual outcome. Perhaps these markets are better at forecasting the forecast than they are at forecasting the outcome! Looking further into the results, while most of the predictions have a smaller error than the HP official forecasts, the differences are, in most cases, quite small. For example, in Event 3, the HP forecast error was 59.549% vs. 53.333% for the prediction market. They’re both really poor forecasts. To the decision-maker, the difference between these forecasts is not material.

There were eight markets that had HP official forecasts. In four of these (50%), the forecast error was greater than 25%. Even though only three of the prediction market forecast errors were greater than 25%, this can hardly be a ringing endorsement for the accuracy of prediction markets (at least in this study).

Without doing the math, it appears that there is a stronger correlation between the predictions and the HP official forecasts than there is between the predictions and the actual outcomes. But, to make the case for prediction market accuracy, the correlation has to be significant with respect to the actual outcome. It was noted in the study that, in several cases, there was evidence to suggest that the official forecasts were based, in part, on information gleaned from the prediction market exercise. Perhaps this explains the correlation with the HP official forecasts. It appears that many of the participants were also involved in setting the official forecasts. To the extent that they may have dominated the trading in the prediction markets, it is not surprising that the predictions would be closer to the official estimates than they would be to the actual outcomes.

Interestingly, in using the prediction markets to make forecasts, rather than using all of the trades, the authors chose to determine several forecasts based on the last 40%, 50% and 60% of the trades. They argue that the latest trades are more likely to be at or near the equilibrium. Yet, one of their observations is that there were no significant trends in trading (they looked at each 10% of the trades). They speculate that the market quickly aggregates a prediction, with subsequent trading moving the prediction around the equilibrium. If this is true, it makes little sense to exclude any of the trades from the determination of the prediction. Arguing from first principles, we would never want to exclude any trades, because it would interfere with the offsetting of trading errors. Excluding trades means we are excluding the information attached to those trades, which runs counter to the theory behind prediction markets.
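The effect of trimming early trades can be sketched as follows. The trade prices are made up, and taking a simple average over the retained trades is an assumption made for illustration. It is not a description of how the authors actually derived their forecasts.

```python
def tail_mean(prices, fraction):
    """Average of the last `fraction` of trades (fraction in (0, 1])."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    start = int(round(len(prices) * (1.0 - fraction)))
    tail = prices[start:]
    return sum(tail) / len(tail)


# Hypothetical trade prices for one contract, in time order.
trade_prices = [0.30, 0.42, 0.38, 0.41, 0.39, 0.40, 0.41, 0.39, 0.40, 0.40]

for fraction in (0.4, 0.5, 0.6, 1.0):
    estimate = tail_mean(trade_prices, fraction)
    print(f"last {fraction:.0%} of trades: {estimate:.4f}")
```

If prices settle quickly and then only fluctuate around the equilibrium, as the authors speculate, the estimates from the different cut-offs will be nearly identical, so excluding the early trades gains little.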

Though the prediction market results were “better” than the HP forecasts, some markets were better than others. It would have been nice to know why this happened. To be useful, prediction markets will have to be consistently better performers than other forecasting methods. From this study, we aren’t able to make this conclusion. Unfortunately, the authors don’t delve into this issue.

Perhaps the sleeper conclusion is result 2: The probability distributions calculated from market prices are consistent with (those for the) actual outcomes. This is truly useful information. It gives us a measure of uncertainty or risk. Traditional forecasting methods do not provide this information (at least not objectively). Decision-makers can use this information to focus their efforts more wisely to reduce the uncertainty or more fully develop contingency plans where the uncertainty is greatest.

When I look at the graphs of the distributions, they appear to be fairly widely dispersed, rather than tightly focused around the mean. I’m guessing that the relatively small number of participants and the short trading period had something to do with this. It would have been nice to experiment with longer trading periods and greater numbers of participants to see whether this would have reduced the variance around the mean. It would also have been useful to keep these markets open, so that we could see how the distributions changed as they got closer to the outcome being revealed. After all, one of the major benefits of prediction markets is that they are able to dynamically update predictions.

Result 3 is valuable as well. They argue that the prediction markets were particularly good at predicting whether the actual outcome would occur above or below the HP official forecast. They looked at the direction the distributions of the prediction outcomes were skewed to predict whether the actual outcome would be higher or lower than the HP official forecast. It worked. In all cases they were able to make the correct prediction. Given that the official forecasts were usually wrong (as is the case with most forecasts), knowing whether the actual outcome is going to be higher or lower than the official forecast reduces the error (uncertainty) by at least 50%. There might be something to this analysis, at least for HP’s forecasting. It would be interesting to see if this holds up with other prediction market results. Too bad, no one seems to be looking at this.
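One simple reading of this test can be sketched in Python: add up the market-implied probability above and below the official forecast and see which side is heavier. This is an assumption about the mechanics, not the authors' documented procedure, and the outcome ranges, probabilities and official forecast below are hypothetical.

```python
def predicted_direction(bins, probs, official_forecast):
    """Return (direction, mass_above, mass_below) for a binned market distribution.

    bins  : list of (low, high) outcome ranges
    probs : market-implied probability for each range
    """
    above = below = 0.0
    for (low, high), p in zip(bins, probs):
        if low >= official_forecast:
            above += p
        elif high <= official_forecast:
            below += p
        else:
            # The forecast falls inside this range: split its probability
            # in proportion to the width on each side (assumes a uniform spread).
            share_above = (high - official_forecast) / (high - low)
            above += p * share_above
            below += p * (1 - share_above)
    if above > below:
        direction = "above"
    elif below > above:
        direction = "below"
    else:
        direction = "tie"
    return direction, above, below

bins = [(0, 100), (100, 200), (200, 300), (300, 400)]  # hypothetical ranges
probs = [0.1, 0.2, 0.4, 0.3]                           # hypothetical market prices
direction, above, below = predicted_direction(bins, probs, official_forecast=220)
print(direction, round(above, 2), round(below, 2))
```

Here 62% of the probability sits above the official forecast of 220, so the market would be read as predicting an outcome higher than the official number.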

My Conclusions (so far)

Run a lot of prediction markets, using a variety of participant sizes, to determine the effects on liquidity, prediction distributions, accuracy and speed of prediction. We need more than a sample of 12 prediction markets. We need more than 7 – 24 participants in each market. Keep the markets running after the initial prediction is determined, so that we can see how the market incorporates new information and how much more accurate the prediction becomes. Perform more detailed post-mortem analyses. We need to know why the participants made their trading decisions. We need to know when the market has reached an equilibrium.

Run prediction markets on lots of different things. We need to figure out why some markets are more predictable than others.

Posted by: Paul Hewitt | April 10, 2009

Testing Prediction Markets?

Chris Masse (Midas Oracle) commented on this article (Putting Predictive Market Research to the Test), calling it “truly bizarre research.” He’s right. It’s not a test of prediction markets at all.

I’m hard pressed to figure out where to start in critiquing this “research”. So, let’s begin with the fact there was no prediction market involved. Instead the researchers asked the participants what they thought their peers would do and compared the result with what the participants said they would do. Without a prediction market to aggregate the responses, we really have two polls going. Given the low cost of operating a real prediction market, why was one not used?

 

Next, we have the fact that all of the participants are oncologists. I think it is safe to say that this is a fairly homogeneous “crowd”, highly likely to be deficient in diversity, a pre-condition for prediction markets to operate effectively. The problem with using such a homogeneous group of participants is that many (most?) will have the same “pieces” of the puzzle to be determined, rather than having a diverse group that has many more pieces (however small) that would be aggregated into the outcome prediction.

 

There was no actual outcome in the study. It was a hypothetical treatment. The study’s authors draw conclusions about participant behaviour that are irrelevant. It seems to depend on the oncologists’ personal treatment approaches and opinions and the order in which the questions are posed. All the more reason for using a larger, more diverse “crowd”. They argue that in some cases, the predictive market result was “more optimistic” than that from the individual responses. In other cases, this wasn’t so. One result may have been more optimistic than the other, but which was more right? With no actual outcome, we will never know from this study. The authors note that traditional, survey-type responses, about what someone says they will do and what they actually do, are usually heavily discounted (as much as 50%). In short, such responses are unreliable. To compare the “predictive market” responses with these traditional responses, as they did in this study, is kind of ridiculous.

 

The study indicates that the predictive market results had “tighter” distributions, and concluded that fewer participants could be used to generate predictions (thus would save money in the future). False. Just because the distribution is tighter, does not mean you can use fewer participants. The more homogeneous the group, the tighter the distribution. A very small group may have a very tight distribution (or it may not). Furthermore, you really do need a “crowd” to run a prediction market. Optimally, we don’t want a “manufactured”, “tight” distribution, we want a good estimate of the true distribution.

 

Next time, they should run a real prediction market on a potential new treatment and compare the prediction with that obtained using a “traditional” forecasting method. Both predictions would be compared with the actual outcome (once known), to determine which provided the better predictive accuracy. That would be a true test of prediction markets.

Posted by: Paul Hewitt | April 5, 2009

Practical Enterprise Prediction Markets

Lately, there has been a lively discussion on-line regarding the slow adoption of prediction markets in the corporate world.  It seems that the major researchers and academics believe that it is just a matter of time until the corporate world wakes up and sees the incredible value of these markets.  Others, like Chris Masse (Midas Oracle), are more than a bit skeptical.

At first, I was very optimistic about the value of prediction markets and their eventual highly esteemed place in the corporate forecasting world.  The logic behind the basic theory of prediction markets makes a lot of sense.  You take a “crowd” (lots) of people, each with his own set of information and opinions, let them make choices (independently), and aggregate those choices.  Each person holds a piece of information with an associated error factor.  The law of large numbers ensures that the aggregated error will be quite small, leaving a combined chunk of “information” that is better than any individual’s piece of information.  Designing sophisticated markets would be able to reveal not only the most likely forecast outcome, but also the expected distribution of outcomes (or uncertainty).  And, all of this could be done very cheaply.  Seemed like a sure winner to me.

There were two major stumbling blocks in the corporate arena – anti-gambling and insider-trading laws.  These are still issues, but I won’t get into them here, because I don’t think these were the main reasons holding major corporations back from incorporating prediction markets in their forecasting processes.

Prediction markets have been available for many years, yet the number of publicized, successful implementations is really quite small.  Many have been run as short-term “pilot” projects, which rarely seem to achieve a permanent place in the corporate forecasting process.  When you consider that most of the major international consulting firms (McKinsey, et al.), leading academics/consultants (Hanson, et al.) and several prediction market software providers have been promoting them, it is really quite amazing that there are so few bona fide enterprise prediction markets.

Here are my thoughts as to why they haven’t caught on:

Failure to follow First Principles

Unless firms (and their consultants) fully understand all of the prerequisites (first principles) for proper functioning of a prediction market and make sure the implementation addresses all of these requirements, the market is more likely to fail or provide inaccurate predictions.

For example, prediction markets need a large number of participants (and diverse ones at that).  Several academics have come up with innovative methods of facilitating trades through market maker mechanisms.  These have provided market liquidity that allows prediction markets to function (i.e. facilitate trades) even with a relatively small number of participants.  It is a neat little “trick” to make the market seem larger than it is in reality.  Unfortunately, the market maker mechanism allows the “crowd” prerequisite to be violated.  In addition, a smaller crowd lessens the diversity of the participants, at least partially undermining another key prerequisite.  As a result, a smaller crowd has the distinct potential to compromise the accuracy of the predictions.

The various market maker mechanisms also introduce a market distortion, which influences trading behaviour.  More work needs to be done on this, but it is my belief that market scoring rules create highly lucrative potential trading opportunities.  Combined with a “play money” market (where there is little to lose), I believe this creates disproportionate incentives for traders to undertake very risky investment decisions.  Few companies operate with a high risk profile, which calls into question the use of predictions based on risk-seeking traders.

It is interesting to note that the various software providers promote the ease of getting started in prediction markets.  True, it is easy to set up a market using the software.  The difficult part is making it function properly.  The software is merely a tool for aggregating the traders’ opinions.

Public Nature of Forecasts

Judging by the types of enterprise prediction markets that have published results, it appears that many companies have not been focusing on serious, high value forecasting issues.  Perhaps it is the public nature of the resulting prediction that is holding them back.

For example, in many cases, management has a vested interest in creating a forecast for the “market” that may not bear much resemblance to the “true” forecast.  The (“public”) existence of the “true” forecast would undermine their promotion of the official forecast for public consumption by the markets.  A bad situation, I know, but there is more than ample evidence that this is a widespread phenomenon.

Existing forecasting practices utilize senior management and consultants to determine the official forecasts.  This group of strategic planners can be trusted to keep the forecasts confidential.  Prediction market forecasts are much more widely known throughout the company.  Most often, the forecasts are based on what they need to show, as opposed to what they might reasonably expect.  Then, of course, the forecast (budget) is pushed down to the lower levels to do whatever is necessary to hit the numbers.  As we see (rather frequently), this often results in many seriously wrong actions taken within companies.

If management is mildly concerned about prediction market results becoming public, it is highly unlikely that they will tackle the most important forecasting issues in this manner.  Perhaps the best way to break into the market is to operate in parallel with existing forecasting methods until prediction markets prove their worth and companies figure out how to minimize the public disclosure of these forecasts.

Practical Usefulness Issues

In order for companies to incorporate prediction markets into their forecasting systems, they need to prove their usefulness.  I think it is obvious that prediction markets have the potential to be extremely useful in this regard, but it is all in the implementation.

As discussed above, software companies make it sound so easy to implement a prediction market, but that is only a small part of the process.  There are a number of major issues that make it difficult to implement effective prediction markets, and the literature has not been particularly useful in resolving them.  While many of these issues have been raised in the literature, the discussions have been very general and sorely lacking in the practical implications.  I guess that’s where the consultants come in, but it also means that a great deal of education is required in order to “sell” the concept.  This needs to change.

Advance predictions & Incentives

In order to be useful, an accurate prediction must be determined well in advance of the actual outcome.  It makes little sense to run a market where you obtain the prediction just before the actual outcome occurs.  This sounds obvious, but it is actually quite difficult to achieve, because traders want to know how their “investment” (bet) turned out, fairly quickly.  This runs counter to the corporation’s need to know the prediction in advance.  So, innovative incentives have to be designed to encourage traders to adopt patient investment strategies and be rewarded for investing in longer-term outcomes.  Not only do they need to make investment decisions well in advance, but as new information becomes available, they have to be encouraged to continue trading in the market.  This provides corporations with dynamically updated predictions, which yield valuable information on trends and level of uncertainty, and may indicate the strength of various factors influencing the outcome.

Sufficient, appropriate traders

As discussed above, companies need to have a sufficient number of traders for each market, to ensure that the “crowd” prerequisite is met.  Where necessary, these traders will need to be trained in trading on prediction markets, and the incentive systems need to be explained (and preferably tested), to ensure appropriate trading behaviour is encouraged.

Focus on Valuable Variables

Management needs to determine those conditions, events and actions (variables) that are most valuable to predict, and they must know what to do with the resulting prediction when it is determined.  Again, this sounds obvious, but it isn’t something that can be determined in a few minutes (as suggested by several of the software providers).

Dynamic Analysis

One of the major benefits of prediction markets over other forecasting methods is that they provide a built-in mechanism for continuously updating their predictions.  Assuming the appropriate incentives are in place to promote continuous trading, movements in the prediction over time provide valuable information to management.  Similarly, the distribution of “investments” in the prediction options provides a measure of uncertainty in the outcome, and changes in the distribution will indicate changes in uncertainty, providing management with an early warning system for evaluating forecasting issues.

Bottom Line

I do think that enterprise prediction markets will eventually reach a tipping point, but a lot of work needs to be done.  The academic literature is good, but it is becoming too technical and theoretical.  This has to scare the corporate types.  The focus needs to be on practical implementation issues.  It needs to get away from sweeping generalizations with respect to implementing prediction markets.  Consultants need to step up and focus on rigorous implementation planning that never forgets the first principles that make prediction markets work.  Then, we can be useful helping forward thinking executives run their companies better.

Comments?

Approaching business problems differently.

My response:

Hi Jed…

I agree with your summary. In addition, I think prediction markets offer several additional benefits, including: faster predictions, continuous updating, a measure of uncertainty surrounding the prediction, and reasonable cost. I’m sure there are a few more, but for now, this should do.

I am strongly in favour of using prediction markets to complement existing forecasting methods, especially where they are able to quantify uncertainty. Your example of project milestones is another excellent use for prediction markets.

Although your research indicates that as few as 15 participants can achieve calibrated results, I am skeptical. All of the market maker mechanisms will ensure that there is market liquidity, but I believe they influence trading behaviour (generally making traders quite risk-seeking), which undermines the accuracy of the predictions. Such market maker mechanisms were designed to allow markets to operate with smaller numbers of participants, but doesn’t this degrade the “crowd” precondition for successful market predictions? You aren’t likely to have a very diverse group with a small number of participants.

The basic theory behind prediction markets is that each trader has a piece of information combined with an error factor, and the aggregation method adds the pieces of information together, with the error factors cancelling out (more or less), resulting in more accurate information. Having a smaller number of traders not only means having fewer pieces of information to aggregate, but also, the error factors will not “cancel out” properly.
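The cancelling-out argument can be illustrated with a small Python simulation. It is only a sketch under strong assumptions: every trader's estimate is the true value plus independent, unbiased noise, and the "market" is a plain average rather than any real trading mechanism. The function name, true value and noise level are all hypothetical.

```python
import random
import statistics

def mean_abs_error(n_traders, trials=500, true_value=100.0, noise_sd=20.0, seed=0):
    """Average absolute error of the crowd's mean estimate over many simulated markets."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(trials):
        # Each trader holds the truth plus an independent error factor.
        estimates = [true_value + rng.gauss(0, noise_sd) for _ in range(n_traders)]
        errors.append(abs(statistics.fmean(estimates) - true_value))
    return statistics.fmean(errors)

for n in (5, 15, 100, 500):
    print(n, round(mean_abs_error(n), 2))
```

Under these assumptions the aggregate error shrinks roughly in proportion to one over the square root of the number of traders, so a crowd of 15 leaves far more residual error than a crowd of several hundred. If the traders' errors are correlated, as they would be in a homogeneous group, the errors cancel even less than this sketch suggests.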

As I see it, one of the major stumbling blocks is getting enough people to be interested in each market (and staying interested). The greatest benefit of prediction markets comes from forecasting the outcome as far in advance as possible, but this runs counter to the traders’ need to know the outcome “immediately” (or at least soon). You can see this in most marketplaces where very few people trade in long-term markets. Most trade in the markets that will close in the next few hours, days or maybe a week. These very short term market predictions have little value to a decisionmaker.

If prediction markets are to gain more widespread acceptance in the business community, they will need to focus on ways to ensure their predictions are useful. That is, by providing predictions well in advance of the outcome, accurately (diverse crowd, properly motivated) and with a very reasonable cost.

Much more work needs to be done in the area of motivating traders.

Just my thoughts for now.

Posted by: Paul Hewitt | March 14, 2009

Measuring Market Entropy in Prediction Markets

I came across this blog entry on Inkling Markets’ new support site:

Measuring Market Entropy by James Hilden-Minton, Ph.D.

He proposes using an entropy metric to measure market uncertainty in prediction markets.  While it is an intriguing idea to come up with a measure of uncertainty in these markets, I’m not sure whether this is the one that will do the trick.

Here is my response.

I, too, would like to see a metric to track the uncertainty surrounding market predictions.  The concept of entropy has some appeal, but I’m not sure how it might be applied in a prediction market marketplace.

 

Theoretically, market entropy will start out high and approach 0 as information is incorporated into the market, but very few markets actually achieve a 100% likely outcome before trading is suspended.  So, there will always be a positive entropy metric for a market.  Even where the market is almost, positively, certain of an outcome, there will be a fairly high entropy metric (relative to the range of entropies between the minimum and maximum values).  How do we determine how much entropy is too much?

 

If you want to compare entropy across a variety of markets, you would need to standardize the metric.  I imagine this might require using logarithms with the base equal to the number of possible outcomes (binary market = base 2, American Idol = base 36?!). This will ensure that the maximum entropy possible in every market is 1.000.  Then, we might compare entropies between markets.  But this, too, has problems. 

 

Consider two markets, one binary, the other with three outcomes.  If the binary market uses log2 and the odds are even, entropy would be 1.000.  If the 3-outcome market uses log3, with even odds, the entropy would also be 1.000.  We should be able to compare the relative entropies of these two markets.  Now, after some trading, the traders sell off all of one of the three outcomes, leaving only 2 outcomes.  Now, the market is, essentially, a binary one.  If the remaining odds are 50% for each outcome in both markets, the measure of uncertainty should be the same, but it isn’t.  The 3-outcome market now has an entropy of 0.631 vs. 1.000 in the binary market.  Part of the problem is that the initial odds are not equal for each possible outcome (and they rarely will be).  You can carry this analysis further by making the odds for the two outcomes the same, in each market, and comparing the resulting entropy metrics.  For example, at 10% / 90% the entropies are:  0.469 (binary market) and 0.296 (3-outcome market).  The level of uncertainty is virtually identical, but the entropies are quite far apart.
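These figures can be checked with a short Python sketch. The function name is made up for illustration, and the convention of keeping the log base equal to the number of outcomes the market opened with (even after one outcome is sold down to zero) follows the standardization described above.

```python
import math

def normalized_entropy(probs, n_outcomes=None):
    """Shannon entropy using a log base equal to the market's number of outcomes."""
    n = n_outcomes if n_outcomes is not None else len(probs)
    # Outcomes priced at zero contribute nothing to the sum.
    return sum(-p * math.log(p, n) for p in probs if p > 0)

print(round(normalized_entropy([0.5, 0.5]), 3))          # binary, even odds: 1.0
print(round(normalized_entropy([1/3, 1/3, 1/3]), 3))     # 3-outcome, even odds: 1.0
print(round(normalized_entropy([0.5, 0.5, 0.0]), 3))     # 3-outcome, one sold off: 0.631
print(round(normalized_entropy([0.1, 0.9]), 3))          # binary at 10% / 90%: 0.469
print(round(normalized_entropy([0.1, 0.9, 0.0]), 3))     # 3-outcome at 10% / 90%: 0.296
```

The 3-outcome values are simply the binary values divided by log2(3), about 1.585, which is why the two markets report different entropies for the same remaining odds.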

 

Perhaps the answer is to track entropy from the initial entropy metric when the market opens with management’s best estimates of the initial probability distribution.  As the market moves the distribution, the entropy would decrease (hopefully).  This may provide a measure of decreasing uncertainty.  It may be able to show that the market is helping to decrease the measurement of uncertainty, relative to management’s best estimates.

 

There is also a problem with using entropy to determine when the market has incorporated all information (an equilibrium?).  For example in a binary market, entropy will be at its maximum when the odds are 50:50.  A slight change in odds will produce only a very slight reduction in entropy.  The problem is that this is almost the most uncertain condition, yet the entropy would be “flat”.  The entropy will change the greatest amount as one of the outcomes becomes most highly favored by the market (and there is less uncertainty surrounding the outcome).  Think of the US election, where there was only a small difference between the Democrat and Republican votes, yet it was a landslide.  The entropy would have been quite high, yet there was a significant “certainty” to the prediction.
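A quick Python sketch shows how flat the entropy curve is near even odds. It compares the same five-point move in the odds at two different starting prices; the function name and the chosen prices are illustrative only.

```python
import math

def binary_entropy(p):
    """Entropy (base 2) of a binary market priced at probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for start, end in [(0.50, 0.55), (0.90, 0.95)]:
    drop = binary_entropy(start) - binary_entropy(end)
    print(f"{start:.2f} -> {end:.2f}: entropy falls by {drop:.3f}")
```

Moving from 50% to 55% lowers the entropy by only about 0.007, while moving from 90% to 95% lowers it by about 0.183, so the metric barely registers changes where the market is most uncertain.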

 

It is, however, an intriguing concept that should be explored further.  My preference would be for a statistic that measures the dispersion of the probability distribution of outcomes, which could be tracked and compared between markets.
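As one possible form of such a statistic, here is a minimal Python sketch that computes the mean and standard deviation of a market-implied distribution. It assumes each outcome range can be represented by its midpoint; the midpoints and probabilities are hypothetical, and this is only one candidate dispersion measure among many.

```python
import math

def mean_and_std(midpoints, probs):
    """Mean and standard deviation of a market-implied outcome distribution."""
    mean = sum(m * p for m, p in zip(midpoints, probs))
    variance = sum(p * (m - mean) ** 2 for m, p in zip(midpoints, probs))
    return mean, math.sqrt(variance)

midpoints = [100, 110, 120, 130, 140]   # hypothetical outcome-range midpoints
early = [0.20, 0.20, 0.20, 0.20, 0.20]  # market just opened, flat distribution
late = [0.05, 0.20, 0.50, 0.20, 0.05]   # after trading, mass concentrates

for label, probs in (("early", early), ("late", late)):
    mean, std = mean_and_std(midpoints, probs)
    print(label, round(mean, 1), round(std, 2))
```

Both distributions have the same mean, but the standard deviation falls as trading concentrates the probability, which is the kind of movement that could be tracked over time. To compare markets measured in different units, the standard deviation could be divided by the mean.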

 

Just a few thoughts from a non-mathematician.

 

 

Posted by: Paul Hewitt | March 12, 2009

Fallacy of Economic Forecasts

Economists are too well-known for their wild forecasts of future economic conditions.  Just today, I reviewed the Economists’ poll of economic forecasts for many of the world’s economies.  Projections of the percentage change in real GNP ranged from minus 1% to minus 4% (except for Japan at minus 7.6%).  At least the poll is showing negative growth for 2009 and 2010!  A short while ago, these projections were showing modest gains!

Anecdotally, at least in Canada, we are seeing most of the population cutting back on their spending by substantial amounts (not 1 – 4%).  The cut backs are not confined to major expenditures, but include the smaller items in a household budget, like dinners out, entertainment, vacations, etc…  Among those people that I know (rich, poor and in-between), everyone has been cutting back their spending by a minimum of 10%.  I live in an area that used to receive 5 – 10 ad mail pieces per day.  Now, it is down to one or two.  Small businesses are putting a hold on their advertising to local residents.  Could it be that consumers are deferring their purchases no matter how good the product or price?  For the most part, I think this is the case.  I think it is safe to say that the true drop in economic activity is far in excess of 4%.  But this reduction is not reflected in the economic forecasts.  Why is that?

Well, so far, we have not seen the full effects of the downturn in consumption, as businesses are simply depleting their inventories.  The replacement orders will not be coming as quickly or to the same extent as they were a year ago.  When this starts being felt by the manufacturers, watch for another round of layoffs and terminations.  Watch for prices to fall, as businesses try everything to generate cashflow.

Economists use forecasting models to predict the future.  These models have numerous assumptions built into them, based on historical relationships and trends.  We are in a completely new situation in today’s economy.  The old assumptions simply do not apply the way they used to (not that economists’ forecasts were particularly accurate before)!

Take a look at the stock markets.  Their indices have dropped substantially over the last year.  In very simplistic terms, theoretically, stock prices are based on discounted future cash flows and risk.  Certainly, the economy has become riskier, and firms’ prospects of generating cash flows have decreased.  The “market” appears to have estimated that cash flows will be decreasing by as much as 40%.  This, too, doesn’t jibe with the economists’ projections.

Perhaps a more likely reason for “rosy” economic forecasts is that they don’t want to be too pessimistic and they do like to stay within their comfort zone in terms of past predictions and the predictions of other economists!  If they were too pessimistic, perhaps consumers would become collectively depressed, adding fuel to the fire.

My advice: keep your eyes and ears open to gauge for yourself where the economy is headed.  Until you start to see real people becoming optimistic and starting to spend on discretionary things, we will continue to be in a severe recession.

Clearly, to be successful, an EPM must be more accurate than other means of forecasting, given the cost of setting up and running the market (i.e. benefits > costs). Also, it has to provide predictions for conditions, events or actions, sufficiently in advance, such that the corporation may take action to mitigate losses or take advantage of expected opportunities. The corporation must be able to change course, if the prediction market indicates that this would be wise. If the enterprise is unable to act, even with better information, the value of the prediction is minimal.

A major determinant of an EPM’s success should be its ability to measure uncertainty regarding predictions of future conditions, events and actions. This would allow corporations to focus their contingency plans (including hedging, insurance, etc…) in the areas most likely to require them (and avoid wasting resources on unlikely future situations).

Of course, the prediction markets must operate effectively, meaning they must have a “crowd”, that has diversity, independence and is decentralized. Ideally, successful markets should not exhibit risk-seeking behaviour by the participants, as most prediction markets are employed to reduce decision-making risk. I fear that most prediction markets employing market scoring rules provide substantial incentives for participants to become risk-seeking. As a “work around” for a failure to attract a “crowd”, MSRs are good for creating liquidity, but not so good for obtaining accurate predictions.

Just a few comments for now.
