If you have read James Surowiecki’s book, The Wisdom of Crowds, you know the story of Galton’s Ox. Early in the 1900s, a live ox was put on display at a county fair. People were asked to guess the weight of the ox, once it had been butchered. Some of the participants were considered “experts” at this, others were not. Francis Galton obtained the list of guesses and found that the average was within one pound of the actual butchered weight of the ox. Not only that, the average guess was better than that of any of the “experts”. And so, with this little experiment, the concept of collective intelligence was born.
There have been many similar experiments involving guessing the date when an event will occur or the quantity of something, like how many jelly beans are in a jar. Usually, the average guess from a large group will be closer to the true date or quantity than the guess from any one person. There are a few conditions that allow this to happen, as we shall see.
Recently, Facebook announced its IPO, and everyone wanted to know what the price would be after it launched. Apparently, a venture capitalist named Chris Sacca suggested that the “crowd” be asked to predict the price. James Proud set up a simple website to collect predictions at Facebook IPO Day Closing Price. The average of the collective guesses turned out to be miserably wrong. It wasn’t even close. The group’s average predicted price (at the close of trading on launch day) was about $54. The actual closing price was $38.23. What went wrong?
“Did everything right”
Ville Miettinen wrote a piece, Predicting the Facebook IPO: The crowd gets it wrong, trying to explain why the crowd got it wrong in this experiment. The article claims that the website “did everything right” in setting up the method of collecting guesses, other than perhaps attaching real money to the guesses. There was a crowd of 2,261 guessers. Miettinen claims they were diverse, coming from all over the world, and there was “rampant” discussion on Twitter about the topic. But collective intelligence requires more than that! The participants actually have to know something about the subject in order to make an informed guess or prediction!
Miettinen notes that all but three of the 26 who guessed the correct price were non-experts. Curiously, a Google+ engineer, a tech entrepreneur and a Bloomberg analyst were considered to be “experts” on Facebook’s share price. These wouldn’t likely be my examples of “experts” on this topic, if I were making the classification. Senior investment bank executives, fund managers, and perhaps senior Facebook management would be my picks, especially if they had inside information about this particular IPO. One of the features of collective intelligence, especially with respect to prediction markets, is that it can reduce bias in making predictions.
Miettinen claims that “experts”, more so than non-experts, are sensitive to hype. While there certainly was a lot of hype about Facebook before the launch, I would expect “experts” to be more immune to hype than the average person. More likely, this group of guessers suffered from the common affliction of herding. When people don’t know very much, they tend to follow the “herd” (do what other people do).
The Herd Tweets
It appears that many of the guessers posted their predictions on Twitter, along with comments about the IPO price. This feature of the experiment may have caused the eventual prediction to be less accurate than it might have been. I’ll come to this shortly, but there is good evidence that this particular group of guessers held very little useful information about the topic. Consequently, publicizing information about others’ predictions and spreading hype would have done nothing more than generate herding behaviour among future guessers. If you think others know more than you, you’re more likely to follow their guesses.
Rewarding the Ego
Also, Tweets provided proof that a guess had been made. Once the true price had been established in the market, it would be possible to claim that your guess was right, armed with your previous Tweet. Would you want to be one of hundreds that guessed the most likely price, or would you prefer to be one of a few that picked the outlier price? There was no leaderboard or prize incentive built into this aggregation method, so the participants may have tried to win the ego “prize” of being one of a very few that got it right. Even Galton’s Ox contest had a prize. If the experiment designer fails to provide an incentive for making accurate predictions, the predictions will not be as accurate as they might otherwise have been.
Not Even Good Guessers
I must admit, I am completely baffled by the wide range of guesses submitted by the participants. Even if you don’t know much specific information about something, surely you can make an educated guess! Not so in this case. The predicted prices ranged from $29 to $87! I analysed the participants’ guesses, based on the graph provided on the prediction website, and found that the average price was $54.53, with a standard deviation of $11.37. Based on these guesses, the actual price of Facebook would have been expected to fall within the range of about $43 – $66, with a likelihood of 68% (one standard deviation either side of the mean).
The wide range of prices and the large standard deviation indicate that there was a high level of uncertainty in the guesses. Put another way, the participants really didn’t know very much about Facebook, share price behaviour or IPOs. No method of aggregation (such as averaging) can create information that the participants do not hold. By making a guess, each participant injects his or her information into the model (here, the averaging aggregation method). Since there was very little information held by the participants, the result of averaging was a very poor estimate of the Facebook share price. Garbage-in, garbage-out.
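As a quick sanity check on that one-standard-deviation range, here is a minimal sketch using only the figures reported above (the individual guesses themselves were never published, so the mean and standard deviation are taken as given):

```python
# Mean and standard deviation of the crowd's guesses, as reported above.
mean_guess = 54.53  # average predicted closing price, in dollars
std_dev = 11.37     # standard deviation of the guesses, in dollars

# Roughly 68% of guesses fall within one standard deviation of the mean
# (assuming the guesses are approximately normally distributed).
low = mean_guess - std_dev
high = mean_guess + std_dev
print(f"68% of guesses between ${low:.2f} and ${high:.2f}")

# The actual closing price sat well outside that range.
actual = 38.23
z = (mean_guess - actual) / std_dev
print(f"Actual price was {z:.1f} standard deviations below the average guess")
```

The actual price landed about 1.4 standard deviations below the crowd’s average, i.e. well outside the range the crowd considered likely.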
Contrast this result with Galton’s Ox experiment. Even the most uninformed townsfolk would have been able to limit his guess to a reasonable range of choices. An ox weighs more than me, but less than my horse. An ox has bones and hooves that aren’t butchered. Therefore, the butchered weight can be confined to a reasonable range. The result was a relatively tight distribution of guesses. Each guess therefore carried a smaller error than it would have if the range had been wider. The law of large numbers says that these small errors cancel out, leaving a reasonably accurate average prediction. When the errors are much larger, as they were in the Facebook IPO situation, the errors can swamp the “accurate” portion of each guess.
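The error-cancellation argument can be illustrated with a small simulation. The numbers here are invented purely for illustration: two crowds of 2,000 guessers each estimate a true value of 100, one with tight individual errors (like the ox fair) and one with wide errors (like the IPO crowd), plus a third crowd whose guesses share a common bias (hype or herding):

```python
import random

random.seed(42)  # fixed seed so the illustration is repeatable

TRUE_VALUE = 100.0
N = 2000  # number of guessers in each crowd

# Unbiased crowds: individual errors are random but centred on the truth.
tight_crowd = [TRUE_VALUE + random.gauss(0, 5) for _ in range(N)]   # small errors
wide_crowd = [TRUE_VALUE + random.gauss(0, 50) for _ in range(N)]   # large errors

tight_avg = sum(tight_crowd) / N
wide_avg = sum(wide_crowd) / N
print(f"tight crowd average: {tight_avg:.2f}")
print(f"wide crowd average:  {wide_avg:.2f}")

# A herding crowd: every guess is shifted the same way by shared hype.
# Averaging cancels random errors, but a shared bias survives untouched.
biased_crowd = [TRUE_VALUE + 40 + random.gauss(0, 5) for _ in range(N)]
print(f"herding crowd average: {sum(biased_crowd) / N:.2f}")
```

Both unbiased averages land near the truth (the tight crowd’s more reliably so), while the herding crowd’s average stays roughly 40 above it no matter how many guessers are added.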
Another example of a failure occurred when Apple was about to release the iPad for the first time. Nineteen so-called experts tried to forecast the number of iPads that would be shipped. None of them came even close. Their predictions were all substantially lower than the actual shipments. Again, this was a poll and not a prediction market. This situation shows that when there are only a few predictors, the errors can be large and they may be mostly, or all, in the same direction. There is no way for these errors to cancel out.
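The iPad example highlights a distinct failure mode: when there are only a handful of forecasters and their errors all point the same way, averaging cannot help. A minimal sketch, with entirely invented numbers (these are not the real iPad forecasts):

```python
# Invented illustration: 19 forecasters, all underestimating the true value.
TRUE_SHIPMENTS = 15.0  # hypothetical shipments, in millions of units

# Every forecast falls below the truth, so the errors share one direction.
forecasts = [TRUE_SHIPMENTS - 3 - 0.2 * i for i in range(19)]

average = sum(forecasts) / len(forecasts)
print(f"average forecast: {average:.2f}")  # still far below the true 15.0

# Averaging only cancels errors that point in *both* directions;
# a shared, one-sided bias survives the average untouched.
```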
Miettinen wrote that the participants appeared to form a diverse group, because they came from all over the world and shared a variety of views. Interestingly, based on the Twitter feed of guesses, almost all of the participants were male! I don’t know whether it would have made a difference, having more women involved, but the ultimate prediction could hardly have been worse!
The group of individuals that was given the task of predicting the Facebook share price after the IPO launch had very little knowledge of the subject. Even if a few individuals were sufficiently knowledgeable, their guesses would have been swamped by the relatively large number of erroneous guesses. There appears to have been a significant herding effect, too. I think we can say that the crowd wasn’t as diverse as it should have been. So, just because something is “collective” doesn’t make it “intelligent”. The method of generating a collective prediction was seriously flawed.
What if James Proud had set this up as a prediction market instead? That will be the subject of my next article.