Posted by: Paul Hewitt | January 4, 2010

The Future of Futarchy

I’ve been meaning to write this post for quite some time.  While it is an interesting concept, on paper, I’m afraid that the only place you are likely to see futarchy implemented is in a future Star Trek movie (no offense to bona fide Trekkies intended).  And, I’m sure the mythical planet, “Futarchy”, is doomed and Spock will show no mercy towards its inhabitants.  I apologize if I got any Star Trek details wrong – I only watched a couple of episodes when I was a kid.  I only refer to Star Trek to show that this idea of futarchy is “out there” – really out there, actually.

I think Robin Hanson agrees, at least partially, with this assessment.  In his paper, “Shall We Vote on Values, But Bet on Beliefs?”, he explains that rather than use a scientific approach to assessing the viability of futarchy, he uses an “engineering” approach, which merely seeks to determine whether a concept is deserving of further study, prototype development, etc…  Interested readers should probably read Robin’s paper before proceeding.  I will explain the basic idea and assumptions behind futarchy, but many of the details will not be repeated, here.

While this should be a very short read, it isn’t.  Robin Hanson used 20+ pages to explain why futarchy is “plausible” (and continues to be hopeful of its acceptance), and Mencius Moldbug used 7,400+ words to conclude that futarchy is “retarded”.  Many more words were wasted in blog comments.  This started out as a quick post to dispose of futarchy once and for all, but one fault leads to another, and it can be hard to stop.  Anyway, read as far as you like.  The conclusion doesn’t change.

The structure of this post is as follows:

  1. What is Futarchy?
  2. Discussion of the Assumptions that support considering Futarchy
  3. Decision Market Mechanics
  4. Three Scenarios
  5. The National Welfare Measure
  6. Design Issues
  7. Other Considerations
  8. Conclusion

What is Futarchy?

Futarchy is Robin Hanson’s term for a form of government where decision markets are employed to forecast the likely effect of a proposed policy on some measure of overall welfare, such as GDP+.  If a decision market indicates that a proposed policy is likely to generate a positive welfare benefit (relative to the status quo), the policy is automatically implemented.  Actually, Robin uses the word “immediately” to determine when the proposed policy is to be adopted.  A careful reading of the papers indicates that “immediately” really means the adoption of the policy is “hard-wired” to, or directly follows from, the decision market’s forecast.  Citizens vote for elected representatives, who administer the definition and annual calculation of the welfare measure.  Using decision markets, citizens (speculators) place bets on the likely effects of proposed policies.  In this sense, “We Vote on Values (what to do), but Bet on Beliefs (how to do it)”.
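To make the "hard-wired" rule concrete, here is a minimal sketch of the decision step as I read it. The function name, the argument names and the `margin` parameter are my own illustrative assumptions; Robin's paper does not prescribe this code or a specific threshold.

```python
def should_adopt(welfare_if_policy: float,
                 welfare_if_status_quo: float,
                 margin: float = 0.0) -> bool:
    """Hypothetical futarchy trigger.

    welfare_if_policy:     market forecast of GDP+ conditional on adoption
    welfare_if_status_quo: market forecast of GDP+ conditional on no change
    margin:                assumed cushion the difference must exceed
    """
    # The policy is adopted only when the conditional forecast under the
    # policy beats the status quo forecast by more than the margin.
    return (welfare_if_policy - welfare_if_status_quo) > margin


# Illustrative numbers only.
print(should_adopt(102.0, 100.0, margin=1.0))
print(should_adopt(100.5, 100.0, margin=1.0))
```

Everything that follows in this post is, in effect, an argument about whether the two inputs to this comparison can be trusted.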

The Assumptions

According to Robin, there are three assumptions that support the concept of futarchy.  Here they are, with a brief discussion of each.

1.     Democracies fail by not aggregating enough available information

Basically, Robin states that governments make bad decisions, largely because they have to appease ignorant voters.  In a democracy, every citizen has one vote, but not all citizens are equal, at least not in terms of the validity of their opinions.  He argues that relevant information exists about whether proposed policies will achieve the desired objectives, but that it is not being aggregated accurately, so that politicians may make more correct choices.  If the politicians knew which policies were unlikely to succeed, fewer of them would be adopted. 

While it is true that the majority of the public are poorly informed and lack incentives to become informed, there is a subset of informed “elites” that would be able to make trades in these speculative markets to aggregate accurate, relevant information.  Robin cites a number of studies that lead to the following statement:

“The straightforward interpretation of this data is that experts and those who are better educated actually know more than the general public about which policies are better.”

In making his case, he comes to the conclusion that the general public is not only ignorant, but “fundamentally non-truth seeking” as well.  This presents a problem in developing good public policy, unless the uninformed, irrational “chimps” allow public policy to be determined by the informed, rational elites, “such as perhaps academic advisors.”  He cites a few examples of “contrarian” public opinions, such as “52% of Americans believe astrology has some scientific truth.”  I’m almost convinced that informed traders will have a better chance of aggregating more accurate information for policy decisions, but it doesn’t sound very democratic.  It sounds more like marketing of professors’ services and turning “chimps” into “chumps”. 

On balance, I’m going to give this one to Robin.  Many (perhaps most) of society’s problems can be traced to a lack of accurate, timely information.

2.     Speculative markets are the best known method of aggregating available information

This is where Robin does his usual cut-and-paste job, briefly touching on a variety of prediction (and betting) market “success” stories over the years (none recent, by the way).  There are the following examples that we have all seen before (many, many times):  racetrack odds are better than experts, OJ commodity futures improve government weather forecasts, Oscar markets beat columnist forecasts, gas demand markets beat gas demand experts, US presidential betting markets beat opinion polls about 75% of the time, and the granddaddy of them all, prediction markets beat HP official forecasts 6 times out of 8.

The HP results were “better” by an insignificant amount and were heavily dependent on the official forecasting process (read my analysis, here).  The Oscar markets would not have beaten a poll of the actual Oscar voters.  Isolated “successes” in disparate types of markets do not imply that public policy decision markets will be equally successful.  Yet we see, time and time again, this conclusion being reached on the basis of a very small number of diverse information aggregation field studies.

In a recent op-ed article, Robin explains that speculative markets are the way to go, because they are “an exemplary way to collect and summarize information, at least when we eventually learn the outcome.”  More proof that the more often you state something (anything) the more likely it is to be believed (even if it is beyond belief)!  Note carefully, the actual outcome must become known, for the markets to have any chance of aggregating information accurately.  I’ll have more to say about this, below, as it is not as straight-forward as Professor Hanson would have us believe.

One criticism of Robin’s approach to speculative markets is that he seems to believe that a small number of well-informed traders will always counteract the irrational trades of the uninformed.  In the area of public policy, it is quite likely that some issues will have a very, very small number of “informed” traders relative to the “chimps”.  To me, it is not clear that the informed traders will overpower the chimps.   I will have more to say about this later, but for now, I just want to make the point that speculative markets can work well, but not always.  Not only that, but no one, not even Robin Hanson, seems to care much why some markets appear to work while others clearly do not. 

Reliance on this assumption is very shaky and threatens the entire institution of futarchy.

3.     It is easy to identify rich, happy nations from poor, miserable ones, after the fact.

While agreeing that it may not be the best measure, Robin suggests that GDP may be a sufficient metric for measuring policy recommendations, at least initially.  The measurement (or metric) could be refined to take into account other factors that contribute towards national “welfare” (GDP+).  Policies that are expected to improve national welfare should be implemented.  Subsequently, the measurement of national welfare will identify whether policy decisions have been good.  The logic in favour of futarchy is as follows: 

If a statistical analysis indicates that a policy is likely to have a beneficial effect on national welfare, a speculative market would be expected to indicate the same (unless there were other, valid, reasons for this not to be so).  If it is advisable to consider a policy on the basis of statistical analysis (current practice), it should be equally advisable to consider it on the basis of a speculative market (futarchy).

In a very broad, simplistic sense, this assumption may be true-enough to proceed, though it is an open question whether this is the appropriate metric to be used to assess all (or even most) policy proposals.

Decision Market Mechanics

The mechanics of decision markets are not as simple as Robin would have us believe.  Essentially, these markets are attempting to estimate a form of net present value of the expected welfare measure (GDP+) where the policy is adopted and where it is not (status quo).  The difference between the two estimates is considered the expected benefit of adopting the particular policy.  Given the very long time horizon for most policies that might be considered, it is clear that there is  tremendous uncertainty attached to the calculations. It is almost inconceivable that such markets could provide accurate forecasts before any actual policy effects could be identified.

In order to provide the necessary incentives for trading, the market must be capable of settlement.  This is a required characteristic of all prediction markets; i.e., the actual outcome must be revealed at some point in the future.  Informed traders generate profits by buying low and selling high while the market is open or by buying at a lower price than that at which the market is ultimately settled.  The smartest traders are those that identify and trade on the largest difference between the current expectation and the eventual outcome.  They also know this information before the less-informed traders.  If the market cannot be settled for 20 years or longer, even using some form of indexed security for the payoff, I would argue that the settlement payoff loses its incentive for all but the most patient traders.  We should note that Robin is a bit vague as to the settlement of these decision markets.  We do know that whichever market contains the condition that is not true will be cancelled (i.e. if the policy is approved, the status quo market is cancelled).  He discusses the possibility of calculating some welfare measure over a 20 year period, using various weights and discounts, and an implied assumption about future values for the infinite time period after 20 years hence.  So, settlement is a long, long way off in the future.
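One way to read the patience argument is as a discounting problem. The sketch below assumes a trader who demands some annual return over and above whatever the indexed security already pays; that excess rate, and the 5% figure, are my illustrative assumptions and not numbers from Robin's paper.

```python
def present_value(payoff: float, annual_rate: float, years: int) -> float:
    """Value today of a payoff received `years` from now.

    annual_rate is the trader's assumed required return per year,
    in excess of the return on the indexed security.
    """
    return payoff / (1.0 + annual_rate) ** years


# A $1 settlement payoff, 20 years out, at an assumed 5% excess rate.
print(round(present_value(1.0, 0.05, 20), 4))  # 0.3769
```

Under these assumptions, a trader treats a dollar of eventual winnings as worth well under half that today, which is the sense in which only the most patient traders would find settlement motivating.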

Such markets must continue to trade until settlement.  If not, the very long holding period for almost every decision market would mean that active traders would be limited in how many markets they could participate in.  If they continued to invest in markets before they received any “winnings”, they would, presumably, run out of investment funds.  Most importantly, we would not see the “cream rising to the top”; that is, the best predictors becoming wealthier, relative to the chimps, until a number of markets were to settle, 20 years (or more) hence.  That is an awfully long time to identify the “experts” and give their trades more weight (in subsequent markets).  It also assumes that they will still be alive and willing to trade.  Traders will tend to be young ones, too, in order to enjoy the benefits of their smart trades.  Perhaps Robin had Associate Professors in mind for his model “elites”.  It assumes, too, that they will be equally adept at forecasting policy effects for the issues that will arise 20 years hence.

In the op-ed article cited above, Robin clarifies the settlement problem by allowing trading to continue in the market that is not cancelled, which would allow some traders to cash out, without waiting for the final outcome (and payout).  He indicates that, through such trading, the market will continue to improve the prediction (or forecast).  But, who cares?  The policy decision will already have been made.  Any continued trading in the market and the very long wait until the market settles merely determine the final rewards for the better forecasters and the penalties extracted from the dolts.  There are two reasons why this would be a necessary feature of futarchy.  First, assuming the informed traders are able to cash out before the market settles, this will return liquidity to the marketplace for all policy decision markets.  Of course, there will have to be a sufficient number of chimp-chumps available to facilitate such trading.  Second, the futarchy process requires informed traders to distinguish themselves from the uninformed.  Allowing them to do so, in fewer than the 20 or so years that a typical market may span until settlement, is the only practical method.

Three Scenarios

Realizing that this concept of futarchy is a bit of a stretch, Robin proposes a gradual approach to adoption, starting with corporate governance, moving on to agency decision-making and finally national governance.  I only make mention of these to see whether we can dismiss this whole concept at an early stage.

Corporate Governance

Robin describes how corporations are like small democratic governments.  He considers a simple speculative market involving conditional “dump-the-CEO” and “keep-the-CEO” stocks.  If the “dump-the-CEO” price was “clearly” higher than the “keep-the-CEO” price for “90% of the last week of a quarter”, the CEO would be dumped for the next quarter.  It is not hard to imagine that once such a guinea pig corporation experienced one CEO dumping, many more would follow.  The success of a corporation is not (and should not be) dependent on quarterly results.  Such an institution would require a steady stream of increasingly able CEO candidates (who would be able to hit the ground running, on a moment’s notice).  A continuous learning curve, constant change and massive severance costs would threaten the very existence of any corporation stupid enough to consider such “decision-making”.  Truly shocking in its naivety.
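For reference, the dump-the-CEO trigger described above can be sketched as follows. The paper leaves “clearly” undefined, so the `margin` parameter is my assumption, as is treating “90% of the last week” as 90% of evenly spaced price samples.

```python
def ceo_dumped(prices, margin=0.0, threshold=0.9):
    """Hypothetical version of the quarterly trigger.

    prices:    list of (dump_price, keep_price) samples taken during
               the last week of the quarter
    margin:    assumed gap that counts as "clearly" higher
    threshold: share of samples that must show the gap (90% in the text)
    """
    if not prices:
        return False
    clearly_higher = sum(1 for dump, keep in prices if dump - keep > margin)
    return clearly_higher / len(prices) >= threshold


# Ten illustrative samples: dump beats keep in nine of them.
samples = [(52.0, 50.0)] * 9 + [(49.0, 50.0)]
print(ceo_dumped(samples, margin=1.0))
```

Note how little it takes to trip the rule: one week of prices, once a quarter, with the meaning of “clearly” left to whoever sets the margin.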

Thankfully, Robin Hanson appears to be well-ensconced in academia, safeguarding corporate America from the havoc this nonsense would create.

The only reason I note this scenario, at all, is that the next level involves agency governance, which would follow “after some successful examples of using speculative markets in corporate governance”.  We should be able to quit right now, but there are 20 more pages of Robin’s paper to plough through, and so, we press on.

Agency Governance

While this paper was written before the current economic recession took hold, Robin cites monetary policy as a prime candidate for using speculative markets to set policy.  Apparently, most agree on the variables to be manipulated to achieve a good outcome, and they agree on the statistics that may be used to determine whether a quality policy outcome has been achieved after the fact.

To counter this proposed application, one need only consider the current (sad) state of monetary economic intelligence among the “elite”.  If a monetary expert, like Alan Greenspan, can be so wrong for so long, what chance do the “unthinking masses” have?

Somehow, Robin believes that all we would have to do is make economic information available to the public, including speculators, and a speculative market would determine which expert to believe, setting an accurate market price and the most appropriate interest rate policy.  Sheer Madness.  It is the equivalent of handing out hammers and nails to a crowd of chimps and expecting them to build a house.

But we continue on… to national governance. Once enough people are living in these chimp houses and driving around in chimpmobiles, the case will have been made for hard-wiring speculative markets to the policy enactment process.

National Governance (Futarchy)

Elected representatives define a formal measure of “national welfare”, GDP+, and markets would continuously forecast this metric.  As policy proposals arise, new prediction markets would be implemented to forecast GDP+ conditional on the new policy being enacted and another conditional on the status quo.  Once it has been clearly shown that there would be a forecasted improvement in GDP+ (national welfare) under the proposed policy, it would be immediately implemented.

There are so many ways to be scared by this, it is hard to know where to begin. Few, if any, policies are adopted on the basis of a single metric or desired outcome, yet Robin Hanson is proposing that we do just that.  While it is true that he makes provision for the metric to be a composite of a variety of metrics, this doesn’t solve the problem.  Elected officials are in charge of defining the metric and its composition.  One can only imagine the intense lobbying efforts to influence the definition of GDP+ which could hinder the enactment of beneficial policies or promote harmful policies that should not be passed into law.

Invariably public policies have a variety of objectives.  Selecting one metric (even a composite one) to measure the success of all policy proposals is naïve and simplistic in the extreme.  The effect of any particular policy on the metric will not be observable.  The only way to observe the actual effect of a change in policy on the metric, is to hold all other things constant, which is, of course, impossible to do. 

Robin counters that it is only the difference between the two markets that matters.  However, once a policy has been approved, based on the difference between the status quo and the policy adoption decision markets, the status quo market is cancelled.  The policy adoption decision market always has been, and always will be, attempting to forecast the total national welfare measure (GDP+) assuming the policy is enacted, which is based on 20 or more years of future statistics.  In those intervening 20 years or so, many new policies will be enacted, and every one of them will be expected to improve the national welfare.  What are the odds of such a prediction market being able to accurately (and consistently) predict the actual national welfare that will be determined over a 20 year period? 

I’m sure Robin will counter with the fact that prior to arriving at a policy decision, both decision markets were subject to the same uncertainty about the national welfare measure.  Of course they were.  This only means that both markets must have been equally accurate prior to the policy decision being triggered.  What are the odds?  How could we prove their accuracy?

We don’t have very many long-term prediction markets that can be tested.  David Pennock did look into the issue of calibration of long-term prediction markets, here, finding that they were, indeed, calibrated.  However, I commented on Midas Oracle about the problems with his conclusion as it relates to decision-making.  To summarize, David Pennock’s analysis looked at the calibration of long-term markets 30 days prior to settlement.  By that time, almost all of the uncertainty had been eliminated from the prediction.  We would be more surprised if the markets had not been well-calibrated.  Unfortunately, those same prediction markets were consistently inaccurate for the vast majority of the time they were actively traded.  They only became “accurate” as they neared settlement, when the actual outcome was about to be revealed. 

Unless prediction markets can be understood and developed to the extent that they are capable of consistently providing accurate predictions well in advance of the actual outcome, they will not be of any use, at all, for decision-making.  If the markets are any indication (and they are), it appears that such speculative markets are not very good at predicting outcomes in the face of uncertainty.  Long-term policy benefits are subject to very high levels of uncertainty.  Consequently, the prospect of relying on these markets to guide policy decisions is dangerous, to say the least.  Chimps, even elected ones, might make fewer mistakes.

The National Welfare Measure

“A very simple definition of GDP+ would be a few percent annually discounted average (over the indefinite future) of the square root of GDP each period. A not quite as simple GDP+ definition would substitute a sum over various subgroups of the square root of a GDP assigned to that subgroup. Subgroups might be defined geographically, ethnically, and by age and income. (Varying the group weights might induce various types of affirmative action or discrimination policies.) A more complex GDP+ could include measures of lifespan, leisure, environmental quality, cultural prowess, and happiness.”

This is Robin Hanson’s description of the national welfare measure that would ultimately be used to assess whether the policies adopted were “good”.  In the design issues section of his paper, he discusses the possibility of basing the calculation on a 20 year period of national welfare figures. 
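The “very simple” definition quoted above might be sketched like this. The 3% rate stands in for “a few percent”, and cutting the “indefinite future” off at the end of the supplied series is my simplifying assumption; the function name is hypothetical.

```python
import math


def simple_gdp_plus(gdp_by_year, discount_rate=0.03):
    """Discounted average of the square root of GDP in each period.

    gdp_by_year:   GDP figures, one per year, starting with the current year
    discount_rate: assumed annual discount ("a few percent")
    """
    if not gdp_by_year:
        raise ValueError("need at least one year of GDP")
    # Weight for year t shrinks geometrically with the discount rate.
    weights = [1.0 / (1.0 + discount_rate) ** t for t in range(len(gdp_by_year))]
    weighted_sum = sum(w * math.sqrt(g) for w, g in zip(weights, gdp_by_year))
    # Dividing by the total weight turns the discounted sum into an average.
    return weighted_sum / sum(weights)


# Flat GDP of 100 for 20 years gives a GDP+ of about sqrt(100) = 10.
print(simple_gdp_plus([100.0] * 20))
```

Writing the formula down is the easy part. The trader's job is to supply the 20 (or more) future GDP figures, twice, once for each conditional market.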

This is a lovely intellectual exercise Robin has embarked upon.  The vast majority of the individuals that Robin believes would take part in these speculative markets will not have a clue as to how to forecast GDP+, even in the very simple case.  Many will be perplexed as to how to discount future GDP+ figures.  The vast majority will be unable to calculate a square root of anything.   The intermediate complexity definition involves breaking down parts of the metric into sub-groups and applying weights. We’re now down to a wee fraction of the public that might be considered “expert”-enough to make a considered forecast.  But Robin’s not finished, it could be even more complex, involving environmental quality, lifespan, leisure and a host of other highly subjective factors.  Even the best actuaries will have difficulty here.  Continuing, there is no turning back from globalization, so any definition must take into account the effects of policy changes on foreigners (and other countries’ policy consequences to us).  Finally, no country stands still in time.  Demographic changes will have to be built into the metric definition.  The meek shall inherit the earth, but only if they are fully accredited actuaries!

We can’t be too hard on Professor Hanson; after all, it is a noble cause.  It’s just that, as I noted at the beginning, it belongs more in a Star Trek episode than it does in an academic paper.  It’s just so out there.

Design Issues

In this portion of the paper, Robin Hanson outlines 33 design issues that might prevent the new institution, called futarchy, from operating successfully.  Some appear to be relatively minor concerns, given the discussion points raised so far, so I will focus on those that appear most crucial.  Note that Robin phrases the issues in terms of objections to futarchy.

The Rich Would Get More Influence 

Should the rich be able to undermine the accuracy of the prediction markets, Robin proposes to tax them more (a market distortion) or limit how much each person can trade in a market (another distortion).  Robin thinks that the market forces will see to it that the rich do not have as much influence as they have now, because they will not have proportionately more or better information than the speculators.  Robin’s belief in market forces is unwavering.  As we shall see later, this is a very naïve view.

One Profits Little by Supporting Unlikely Proposals

Here, Robin considers the case where you think you have a strong proposal, but few others agree, holding down the welfare measure such that the policy is never adopted.  It seems unfair that you never get rewarded for your good policy, and they are never penalized for “your being right.” 

In this case, Robin suggests (and he is probably correct) that all political systems suffer from this problem.  Consequently, it may be possible to get the policy implemented on a smaller, local scale and keep trying to convince others that the larger proposal has merit.  One can only wonder as to who might possess the resources to embark on this course of action.  As we will see, later, there is a large cost of proposing a policy initiative. 

OR… could it be that you are wrong and deserve not to have the policy adopted?  OR…  could it be that the uninformed or the manipulators are able to set the market price with their “incorrect” information?  Robin doesn’t believe it is possible for manipulators (or uninformed “noise” traders) to “game” speculative markets, so it can’t be the latter possibility.  In fact, he goes so far as to say that manipulators make the market more accurate.  Maybe the market is working properly after all by preventing you from “being right.”  Maybe you’re not “right”.  OR… maybe manipulators can game these markets.  I think they can, as explained here, here, here and here.  These references apply to several points that follow regarding manipulation of speculative markets.

Some Markets May be too Thin

Robin considers that some markets may be too thinly traded to arrive at accurate estimates, making it possible for a few traders to push the market to favor a bad proposal.  By assuming that pro and con traders are similarly funded, each will try to influence the market, eliminating the thin market condition.  Alternatively, he assumes that the speculators would find out that one side was willing to manipulate the market and make trades to counteract the manipulation.

As I noted in my post, these are highly unlikely assumptions.

One Rich Fool Could Do Great Damage

Here, Robin considers the case where Bill Gates might try to manipulate the market.  If speculators knew which way Bill Gates was trying to move the market, they could easily counteract his trades, as it is assumed that, collectively, they have much more power than he.  Even Robin agrees that it is more likely that the speculators would allow the price to be pushed somewhat by Mr. Gates, because they would assume that Bill knows something that they do not.

People Could Buy Policy Via Trades

Similar to the “Rich Fool” situation, Robin claims that someone could not buy a policy by making the “right” trades, because other traders will only let prices move when they suspect that this new trader has new (accurate) information.  Robin states that if the other traders, with deep pockets, are able to clearly observe that a particular person is trying to manipulate the market, they will not allow the price to change.  Failing to possess such oracle-like market knowledge, the other traders need only know the total quantity and direction of the noise trades in order to make their corrective trades.  Even if the other traders do not know the direction and strength of the manipulation and they are unsure as to whether the manipulator has relevant information, the manipulator’s trades will merely add a bit of noise to the market price.  The sheer weight of the other, informed, traders will nullify the effects of the manipulator’s trade. 

I refer to my posts on manipulative trading, above. 

Corrupting the Welfare Measurement Metric

It is possible that the measurement of the metric that is being forecast could be corrupted to influence the policy decision.  This can be counteracted by having multiple estimates of the metric and using the median estimate as the official one.  I agree, except that, just as we have auditors attest corporate financial statements, we will need appropriately trained, independent, “auditors” to ensure the accuracy of the national welfare measure.
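The median proposal is easy to illustrate. In this sketch the estimate values and the function name are made up for illustration; the point is only that one wildly corrupted estimate barely moves a median, whereas it would drag a simple average a long way.

```python
from statistics import mean, median


def official_welfare(estimates):
    """Take the median of several independent estimates of the metric."""
    return median(estimates)


# Four honest (illustrative) estimates and one corrupted one.
estimates = [100.0, 101.0, 99.5, 250.0, 100.5]
print(official_welfare(estimates))  # 100.5
print(mean(estimates))              # 130.2, pulled up by the outlier
```

The median only protects against a minority of bad estimates, which is why independent “auditors” still matter.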

Welfare Metric Definition

The welfare metric must be defined independently from the policy process.  It is a simplified summary of the values voted upon by the electorate.  Government representatives could improperly influence the definition of the welfare measure.  Robin raises the issue in terms of manipulation designed to support a specific policy proposal. 

In addition, there are likely to be substantial lobbying efforts directed at components of the welfare measurement that are detrimental to powerful interest groups.  For example, large carbon emitters and polluters would seek to minimize the impact of their negative externalities on the welfare measurement, which would lessen the likelihood of punitive legislation coming into force.  If we think lobbying is a problem now, just wait.

Defining When a Market “Clearly” Estimates

Basically, this means determining when the market becomes accurate.  Essentially, Robin considers the need for taking a conservative approach, which would require a minimum of one year of a consistently “clear” price differential, followed by a one or two week (continued) price difference for policy approval to become effective.  It is a good idea to make sure that the market consistently indicates a policy will be beneficial before implementing it.  One major problem is that long-term prediction markets are notoriously inaccurate until shortly before the outcome is revealed (as discussed above).  Do we really want to take chances in setting public policy, based on long-term prediction markets that are completely unproven and most likely inaccurate at the time the decision is made?

Institutional Costs

It is costly to evaluate proposals, so there must be a framework to limit the flow of new proposals.  Robin suggests a fee to be paid to have a proposal considered (which would be refunded or rewarded if the proposal is adopted).  The fee might be set at $10 million (or $10,000), but could be reduced by a subsequent policy change proposal.

Interesting that Robin wants trading input from the public, but most assuredly wishes to exclude them from the proposal process.  Only the rich, corporations and special interest groups will have deep enough pockets to initiate proposals.  It ignores the fact that at least part of the responsibility of our government is to identify issues, propose solutions and implement policies for the benefit of society.  Granted, there are precious few examples of governments setting policies to prevent or avert future problems, but how might futarchy make policy setting more effective in this regard?

What about emergency policies?  Surely, these must be exempt from the process.  Assuming they are, what is to prevent the government, the rich, the corporations and the special interest groups from adopting a do nothing policy until an issue becomes so acute that an emergency policy is required?  Well-oiled lobbying machines will kick into gear, giving us the same, broken process for setting policy.

Fixing Bad Decisions

Here, Robin addresses the issue of a “bug” in the welfare function, probably due to oversimplification.  The elected government must have the power to amend the welfare function and/or reverse the policy decision.  Unfortunately, the process may be too slow to avoid substantial harm and it may be quite expensive to undo a policy. 

Robin proposes that once a policy proposal has been approved, it could be vetoed within the next year, if another market “clearly” estimates bad welfare consequences, using the welfare metric as defined in one year.  That is, he’s proposing an appeal process for policymaking.  Those with the deepest pockets will be in control of veto powers (or at least substantial delaying powers).  Lobbyists will have immense incentives to influence the welfare metric.  Business as usual.

It Seems “Hard” to make one Measure Encode all of our Values

It’s not just “hard”, Robin, it’s downright impossible.  You propose a simplified measure, initially, that would be incrementally amended over time, by the elected representatives.  Lobbyist heaven!

Even your most complex measure of welfare is, still, a remarkable simplification of “national welfare”.  Values in one part of the country will be different from those in another, on many key issues.  At best, “national welfare” will be an “average” of the values held by the citizens.  Every policy decision involves tradeoffs, and one could argue that every policy is different in this respect.  Yet, the national welfare definition “hard-wires” the same tradeoffs for all decisions.  This is far too simple.  I’ll stop here, as this could be the topic of an entire book (and we may not ever need to know the “answer”).

Other Considerations

Budget Constraints & Policy Adoption Ranking

Under futarchy, as long as it is clearly shown that a proposed policy would improve national welfare compared with the status quo, the policy is to be adopted.  You don’t have to be much smarter (if at all) than a chimp to understand that no nation would be financially able to implement every policy that met this standard for adoption.  Simply put, there are budget constraints, now and in the future.  All but the simplest of policies involve financial commitments in the future.  Accordingly, policies adopted in the current period will have budget implications in future years, which will limit the ability to adopt future policies that may be proposed (and that should be adopted).  Futarchy makes no mention of budget constraints.

Consequently, there must be a method of ranking policies that are slated for adoption, so that the most beneficial policies are adopted ahead of weaker (though beneficial) ones.  Given the multi-year aspect of all policies, there must be some consideration of a policy’s adoption on the budget resources of future years, which may prevent the adoption of future policies (either under futarchy or in emergencies). 
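To make the ranking problem concrete, here is a minimal sketch of one possible approach: fund approved policies in order of estimated welfare gain per dollar until the budget runs out.  Nothing like this appears in Robin's paper; the Policy fields, the greedy rule and the numbers are all my own assumptions, and the sketch ignores the multi-year budget effects described above.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    welfare_gain: float  # hypothetical market estimate of gain over the status quo
    cost: float          # hypothetical total budget commitment


def rank_and_select(policies, budget):
    """Greedy rule (an assumption): fund by welfare gain per unit of cost."""
    ranked = sorted(policies, key=lambda p: p.welfare_gain / p.cost, reverse=True)
    funded, remaining = [], budget
    for policy in ranked:
        if policy.cost <= remaining:
            funded.append(policy.name)
            remaining -= policy.cost
    return funded, remaining


# Illustrative numbers only: all three policies "pass" their decision markets.
policies = [
    Policy("A", welfare_gain=10.0, cost=50.0),
    Policy("B", welfare_gain=6.0, cost=20.0),
    Policy("C", welfare_gain=3.0, cost=40.0),
]
funded, left = rank_and_select(policies, budget=80.0)
```

In this example policy C clears the adoption standard but cannot be funded, which is exactly the case where both of its decision markets would have to be voided.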

If a policy is slated for adoption, based on the decision markets, but it cannot be adopted under a budget constraint, then both decision markets need to be voided – the policy adoption market and the status quo market.  Futarchy makes no mention of this possibility and the potential effects it may have on the decision markets.  I wouldn’t even hazard a guess at this point.

Complex Trader Forecasting

Futarchy assumes that if all available, relevant, information is made available to the public, speculators will be able to discern fact from fiction and forecast the national welfare measure accurately.  This assumes that there are a sufficient number of informed traders that have a very good understanding of the issues and information and that they have decision models able to make accurate predictions.   I’m reminded of the super-human, computer-brained, all-knowing beings that I met during neoclassical economic theory classes.  I thought they had died off, but apparently, they’re back!

Forecasting national welfare under futarchy is an incredibly complex problem.  I don’t think it is even possible for speculators to make reasonably accurate forecasts of national welfare.  They simply do not possess the knowledge or understanding, let alone a decision model, that would allow them to make accurate predictions.  Even if the institution of futarchy provides speculators with forecasts and asks them to bet on the most likely one, they still do not have the necessary tools to make that decision. 

If the traders don’t have enough information to make an accurate forecast, the market will not create it.  Prediction markets merely aggregate available information held by the participants; they don’t create new information through trading.  Prediction market proponents understand that each trader’s prediction is an “accurate” estimate combined with an “error” factor.  The assumption is that the errors cancel out, leaving only the accurate information reflected in the market price.  I think this is likely to be true, but not in every case.  Where the individual errors are large, relative to the known, accurate information, the predicting algorithm is likely to break down.  If you were to consider a large number of traders, each with a very small amount of information, it is highly unlikely that the market will function like a jigsaw puzzle, putting all the “good” pieces together and cancelling the “errors”.  The large error factors will prevent any algorithm from generating a reasonably accurate prediction.
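A rough simulation illustrates the point about error size.  It is only a sketch under strong assumptions of my own: the “market” is a simple average of trader estimates (not a real market price), and each trader’s error is independent, unbiased and normally distributed.  Under those assumptions the typical error of the average is roughly the individual error divided by the square root of the number of traders, so large individual errors need a very large crowd before they wash out.

```python
import random
import statistics


def average_forecast_error(true_value, error_sd, n_traders, n_trials=2000, seed=0):
    """Root-mean-square error of a simple average of noisy trader estimates."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(n_trials):
        estimates = [true_value + rng.gauss(0.0, error_sd) for _ in range(n_traders)]
        squared_errors.append((statistics.fmean(estimates) - true_value) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


# Hypothetical numbers: the quantity being forecast has a true value of 1.0.
small = average_forecast_error(true_value=1.0, error_sd=0.5, n_traders=100)
large = average_forecast_error(true_value=1.0, error_sd=20.0, n_traders=100)
```

With 100 traders, the small-error case leaves an averaged error that is a small fraction of the true value, while the large-error case leaves an averaged error bigger than the value being forecast.  If the errors were also correlated (everyone missing the same information), averaging would help even less.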

For example, if we were to run a decision market for a policy designed to combat global warming, the forecast would be wildly inaccurate.  The participants simply do not have enough information to make a reasonable forecast.  The market will not create any information that is not already possessed by the traders.  Yet, the market will look the same as an “accurate” prediction market.  Even worse, it is not possible to determine whether the market is accurate.


There will always be random events that influence the actual outcome.  If markets are “efficient”, it is not possible to predict the effects of future random events on the outcome, based on the information held today.  Prediction markets reflect the level of uncertainty about the actual outcome by providing a distribution of outcome predictions.  When uncertainty is high, the distribution will be relatively flat.  As uncertainty is reduced, the distribution will tend to be tighter.  No prediction market can fully eliminate uncertainty surrounding the actual outcome being predicted. 

To some extent, the longer the time between the prediction (forecast) and the actual outcome (national welfare measure), the greater the uncertainty.  Consequently, most decision markets are likely to exhibit a fairly flat distribution of forecasts at the time the decision will be made.  While Robin Hanson disagrees with me, I believe that such markets are much more likely to be gamed by manipulators.  Furthermore, even if these markets are well-calibrated, they will not forecast the actual outcome, accurately, very often.

Decision Markets vs. Prediction Markets

Back in May, 2009, Mencius Moldbug posted Futarchy Considered Retarded on his blog, Unqualified Reservations.  It was an interesting smack-down of futarchy.  One point he made (among the 7,400 words) was that prediction markets are a fine idea, but decision markets are retarded.  I found this to be an odd comment, because all prediction markets are decision markets.  His distinction didn’t support his argument, and it clearly confused a number of commenters on his blog site and Robin Hanson’s, Overcoming Bias, when he posted his Reply to Moldbug.

Apart from frivolous applications of prediction markets, they all generate predictions about an outcome, and the prediction is used in some decision model to make a decision.  In this sense, they are decision markets.  Robin Hanson uses decision markets to mean a pair of prediction markets that work together to predict the difference between two predictions.  Typically, the difference is the effect of implementing a particular policy (or decision).  Futarchy goes one step further to hard-wire the decision markets to a hard-coded decision model. 
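The distinction can be written down in a few lines.  This is a minimal sketch with hypothetical function names and prices; in particular, the threshold standing in for “clearly” better is my assumption, not a figure from Robin’s paper.

```python
def decision_market_estimate(price_if_adopted, price_if_status_quo):
    """Difference between the two conditional prediction market prices."""
    return price_if_adopted - price_if_status_quo


def hard_wired_decision(price_if_adopted, price_if_status_quo, threshold=0.0):
    """Futarchy-style rule: adopt whenever the estimated gain exceeds the threshold."""
    return decision_market_estimate(price_if_adopted, price_if_status_quo) > threshold


# Hypothetical welfare forecasts from the two markets.
gain = decision_market_estimate(105.0, 100.0)
adopt = hard_wired_decision(105.0, 100.0, threshold=2.0)
reject = hard_wired_decision(101.0, 100.0, threshold=2.0)
```

The first function is all a decision market provides; the second is the part futarchy adds, removing any human judgement between the estimate and the decision.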

If prediction markets are fine, so are decision markets, but futarchy is still retarded.

Information Asymmetry

Mencius Moldbug made the following point:

“A prediction market, like any other market, functions only in the general absence of asymmetrical information. It is with some pain that I absorb the realization that a member of the George Mason School is unable to correctly apply this concept. … The rational approach to a market in which other players have more information than you is not to play. … This is one of the many reasons why insider trading is illegal.”

Robin replied, correctly, that virtually every market has information asymmetry, to some extent.  Markets still function, albeit not perfectly.  Only in cases where the asymmetry is severe is it possible that the market will cease to exist, and even then, over time, such markets seek to reform their institutions to alleviate the information asymmetries.  Moldbug’s assertion is a bit naive, relying much too heavily on the theoretical effects of information asymmetry in markets.  It is a wonderful, logical theory, but it is about as useful as the neoclassical framework for analysing real world markets.

“After the Fact it is Quite Easy to Test Forecast Accuracy”

Robin Hanson stated this in his reply to Moldbug. 

I find this to be a surprising statement by Robin.  It is not “quite easy to test for forecast accuracy” after the fact.  This involves measuring the degree of calibration between the market distribution and that of the outcome.  In fact, given the uniqueness of the outcomes being forecast, it is nearly impossible to measure calibration.  The best we can hope for is to estimate calibration of specific types of prediction markets with some set of homogeneous (more or less) outcomes.  Without calibration, a necessary condition, it is not possible to pass judgement on the accuracy of a prediction market.  Simply arguing that because one prediction market (pick one) possessed the calibration condition, all prediction markets must have it, is simplistic, without any support, and just plain dangerous.
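For readers unfamiliar with calibration testing, a minimal sketch of the usual procedure follows: group many forecasts by predicted probability and compare each group’s average prediction with how often the event actually happened.  The function name, the bin count and the data are my own illustrative assumptions.

```python
from collections import defaultdict


def calibration_table(forecasts, n_bins=5):
    """forecasts: iterable of (predicted_probability, outcome), outcome 0 or 1.

    Returns (mean predicted probability, observed frequency, count) per bin.
    """
    bins = defaultdict(list)
    for prob, outcome in forecasts:
        index = min(int(prob * n_bins), n_bins - 1)  # keep prob == 1.0 in the top bin
        bins[index].append((prob, outcome))
    table = []
    for index in sorted(bins):
        probs = [p for p, _ in bins[index]]
        outcomes = [o for _, o in bins[index]]
        table.append((sum(probs) / len(probs), sum(outcomes) / len(outcomes), len(outcomes)))
    return table


# Hypothetical settled markets: four priced at 0.9 and four priced at 0.1.
many = calibration_table([(0.9, 1), (0.9, 1), (0.9, 0), (0.9, 1),
                          (0.1, 0), (0.1, 0), (0.1, 1), (0.1, 0)])
# A single, unique market gives one bin with one observation.
single = calibration_table([(0.7, 1)])
```

The comparison only means something when each bin holds many comparable markets.  A one-off policy market yields a bin with a single observation, which says nothing about calibration.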

Consider also that Robin Hanson is looking at a 20+ year measurement of the outcome for most public policy decision markets, under futarchy.  At best, there is a tremendous time lag (20 years or more) before it would be possible to test the calibration of any decision markets.  Remember, David Pennock’s analysis involved the calibration of markets 30 days before settlement.  To argue that these markets will be as well-calibrated (and accurate) as horse race betting markets is a ridiculous assumption.  Race track bettors at least read a racing form before making their bets.  In decision markets, we are merely pointing the chimps toward the dart board.

Conclusion

Robin Hanson doesn’t really give us his conclusion in the paper, but we can infer that he thinks futarchy is “promising”, based on his handling of the 33 design considerations and the list of next steps in the evolution of futarchy.  Further support comes from his op-ed piece in August, 2009 and his upcoming futarchy debate with Mencius Moldbug on January 16, 2010.

My conclusion is that futarchy has no chance of success, whatsoever.  It is a hopelessly flawed concept, even if its aim is true.  Decision-making, especially public policy decision-making cannot be done properly with such a simplistic process.  Inevitably, important considerations are left out of the decision, leading to bad decisions.

Robin believes that the information necessary to make good decisions exists, but that it has not been aggregated accurately.  I do believe that this is at least partly true.  However, I also think a large portion of information that is needed to make proper decisions does not presently exist.  Perhaps more of our resources should be directed to uncovering the missing information. 

In particular, market prices in the real world do not reflect externalities from economic activity.  Current proposals for a carbon tax or for cap and trade are attempts to include the cost of carbon emissions in economic decision-making.  If successful, either of these policies would have an impact on market prices for all goods throughout the economy, reallocating scarce resources to better economic uses.  Placing values on pollution, fresh water and other critical resources might be a far more important solution to the information problem in public policy decision-making.  That’s my “out there” idea for the decade to come.



  1. At over 7000 words, this is way too long for me to respond to every point. Care to indicate the top three claims you’d most prefer I respond to?

    Your discussion of the simplest case, dump the CEO markets, suggests you don’t understand the mechanics of my proposal. Boards of directors *already* make quarterly evaluations of whether to retain the CEO; I’m proposing to use markets to make that evaluation. I’m not proposing to base that decision on “quarterly results”, nor that CEOs be fired every quarter. If you thought these markets were biased toward dumping CEOs who should be retained, you could expect to profit from trades in those markets. Boards of directors can also make mistakes. What is your basis for thinking boards of directors make fewer mistakes than speculators who put their money where their mouths are?

  2. Here are the three claims I would like you to consider.

    1. Whether it is possible for (very) long-term prediction markets to be accurate, at the time the decision is made (not 19 or 20 years hence, just before the outcome is determined).

    – I commented on this in the Decision Market Mechanics section, Scenario Three about David Pennock’s calibration testing, and in Other Considerations, where you stated it is “quite easy to test…”)

    2. How you expect traders to be able to understand and analyse GDP+ forecasts and the effects of particular policies on this metric, to the extent necessary to be able to improve the accuracy of the market forecast.

    – Governments (and teams of economists) have a difficult enough time measuring GDP for the current period, let alone one 20 years from now.

    – see my comments about “Complex Trader Forecasting” in the Other Considerations section.

    & a toss-up between:

    3A. Why were budget constraints and policy adoption ranking not considered? (First point in the Other Considerations section)

    3B. Wouldn’t the hefty policy proposal fee create a serious threat to “democracy”? It appears that this would marginalize the vast majority of the populace and create a policy making institution completely dominated by big money interests. (see “Institutional Costs” – last point under Design Considerations).

    As for the dump the CEO market case, your paper does imply that the board would (or should) dump the CEO based on a quarterly decision market. Perhaps I was reading a bit of “futarchy” into these markets (the hard-wiring part). If the board fails to act on the decision indicated by the decision market, why bother with it at all?

    Given the short-term nature of these decision markets, I agree that they might provide useful information for boards, but if they aren’t used by the boards to make actual decisions, what’s the point?

    Even though the post is very long, I tried to cover most of the key points raised in your paper. When Mencius Moldbug attacked futarchy in his blog, his focus seemed to be mostly on the mechanics of prediction markets. He may not make the same mistake in your upcoming debate.

  3. […] Robin Hanson would debate Paul Hewitt, instead” of Mencius Moldbug.  Monday Paul posted a 7000+ word critique of futarchy.  I commented, “Care to indicate the top three claims you’d most prefer I […]

  4. I responded here:

    • I replied to your comments on your site, here:

      As the comments have died down on your site regarding this topic, I wish to assess your response to my questions.

      Robin, you only really answered question 3A, and even that answer is difficult to take, given that you didn’t answer questions 1 and 2. I mean, not only do your “informed” traders need to have a pretty good understanding of the GDP+, the methods of forecasting GDP+, the policies being considered (now and in the future), and a long list of other considerations, they also need to be expert in government budgeting. Simply stating that they would be able to make the necessary adjustments to the forecast of GDP+ is not sufficient support for the proposed futarchy. Arguing that the status quo is equally unable to do this is not a good enough reason to try futarchy (i.e. both may be really bad).

      You avoid answering questions 1 and 2 by claiming that I am under the mistaken impression that in these types of long-term decision markets, accuracy falls to zero and that I think there is an absolute standard of accuracy necessary for decision markets to be useful. Consequently, the only “standard” is whether decision markets are better than the status quo institution. I’m not under any mistaken impressions, here. I agree with you. However, just as you are unable to measure the accuracy of these types of decision markets, you are not able to measure the “accuracy” of the existing policy making institution.

      Your only answer to the long-term accuracy of prediction markets seems to be that they work “fairly well” in the short term, so they will probably work reasonably well in the long-term. I don’t think it is reasonable to make this assumption. As you know, I disagree with the statement that they work fairly well in all short-term markets, too. They work well for some markets, not all.

      Question 2 was really getting at the “brain” of rational expectations, which is a condition of the EMH, which provides the basis for prediction market accuracy. Under the EMH, traders seek new information up to the point where the expected marginal cost of the new information equals the expected benefit from trading on the new information. Thus, the market provides the incentive to become better informed. While this incentive mechanism appears to work fairly well in most stock markets, I seriously doubt it will work as well in futarchy’s decision markets. There are no established models to forecast GDP+ and relevant data is not plentiful (nor is it reliable or accurate). Consequently, the costs of acquiring new information will be steep. Unless the traders intend to bet (oops, I mean “trade”) a significant sum, the marginal cost is sure to exceed the marginal benefit for each and every trader in the market. Such markets will degenerate into “Galton’s ox” guessing games. You might as well take a simple average of all the economists’ GDP+ forecasts to obtain your predictions.

      Your argument for supporting futarchy boils down to this: if there is a possibility that decision markets could work, they should be tried. I disagree. There are costs, significant ones at that, that would make this a bad decision (in my opinion).

      • Your claim that markets don’t always “work”, i.e., provide accurate estimates, isn’t in conflict with my claim that they have typically been as accurate or more accurate than other institutions for short-term forecasts. I also claim that our data on financial markets show that while their accuracy gets worse further into the future, and for more complex topics, it doesn’t fall to zero – prices still contain info. It seems you are now saying you don’t disagree with this claim. I also claim that prices also contain info, with non-zero accuracy, even when there “are no established models”; not sure if you accept this claim or not.

        Innovations always reach a point where they seem to work well on something small and one needs to decide whether to try them on something bigger. You seem to say that since I can’t yet prove the accuracy of these markets compared to existing institutions for long-term complex forecasts, you doubt they will, and don’t see the point of trying to see. It seems to me that a general attitude of this sort would reject ever trying any new thing. If you don’t want to embrace that position, you need to offer a way to distinguish new things that have worked well on a small scale that are worth trying, from those that are not worth trying.

  5. You let Robin off easy. Here are the questions I would ask (and just posted on his blog comment thread):

    1. Using Blackwell’s theorem, Joel Demski has proved that you can’t identify an optimal accounting standard (such as a definition of GDP) without reference to the decision-maker’s preferences. Since citizens’ preferences differ, it follows that there is no way to identify a unique optimal metric to serve as the number being predicted in a decision market. Why wouldn’t a futarchy devolve into political battles over security definition? What would be the value of having an accurate prediction of a number that almost everyone would agree is not sufficiently germane to the policy being considered (though they would agree for different reasons)?

    2. Prediction markets don’t generally allow for any surplus gains to trade–one person’s gain is another’s loss. Since becoming informed is costly, under traditional models of rational trading, trading volume would be zero OR the market is populated with irrational traders. So is futarchy founded on the notion that widespread trader irrationality is the engine that results in market prices that are more rational and informative than would be achieved through traditional political processes? If so, how do you address the large volume of research (both theoretical and empirical) that demonstrates that large volumes of irrational trade often keep rational traders from eliminating price errors through arbitrage? (See this for a model and this and this for a couple of my own laboratory prediction market experiments in Review of Financial Studies.)

    3. Combining 1 and 2, why is it reasonable to assume that decision markets are robust to conflicts of interest? In the spirit of 1, what if the metric is GDP, and I believe it will rise slightly more under policy A than B, but my own industry gets hammered under A and benefits dramatically under B. If I engage in strategic trade that distorts market prices, and therefore policy selection, where is the guarantee that others will be in a position to discipline me, and do so symmetrically? Note that this is far trickier than just establishing that markets can discipline traders who are misinformed (which is “all” it takes to answer 2).

    I would add a fourth question as well:

    4. The world is full of prediction markets, but as far as I know, decision markets are restricted to traditional voting–including markets where people put their money where their mouths are, whether in corporate elections or American Idol pay-per-vote settings. What are the most promising opportunities for a decision market such as you describe, and why don’t such markets already exist?

  6. […] Debate is raging between Robin Hanson and the futarchy critics Written by Chris F. Masse on 2010/01/08 — Leave a Comment – Robin Hanson comments on Paul Hewitt’s blog. […]

  7. I agree with most of your comments, Robin, particularly as they relate to financial markets. However, the types of markets contemplated under futarchy differ from financial markets in terms of their informational efficiency (at least that’s my opinion).

    I’m not saying that the information accuracy drops to zero, but I am saying that it drops to a very low, or at least an unreliable level. Consequently, for the long-term markets, the market aggregates a lot of guesses, as opposed to reasoned opinions.

    Perhaps we differ in our beliefs about market efficiency of decision markets.

    In terms of distinguishing between innovative ideas worth trying and others, I draw the line with a reasonable cost-benefit analysis. Rather than have faith that a market will “work”, even when logic tells you it won’t (a la Miracle on 34th Street), I need to know there is a logical reason why it should (or at least could) work. Then, if the potential benefits are likely to exceed the cost of trying, give it a try.

    Maybe, we need to take a closer look at prediction market efficiency.

  8. […] Pacific island nation by now (it hasn’t happened).  About a year ago, I commented on the Future of Futarchy, where I dismissed the concept.  Despite this, I see that in December 2010 Robin Hanson is still […]

  9. […] I didn’t feel the need to update it since, because very little has changed since then!  The Future of Futarchy still attracts a few readers, especially when Robin Hansen teaches his students about his concept […]

  10. […] Paul Hewitt is less keen. He winds up, in the comments, asking Robin three questions. […]

  11. […] the years, multiple well thought-out critiques of Futarchy have been published, largely focused on the diverse ways such a market-based […]

  12. […] Market manipulation, value subjectivity, low participation, measurement of implemented policies (human arbitration), and volatility. All of which have been summed up in an excellent post by the Ethereum blog (reiterated here), with reasoning collected by Mencius Moldbug and Paul Hewitt. […]

  13. […] opposition to futarchy is most well-summarized in two posts, one by Mencius Moldbug and the other by Paul Hewitt. Both posts are long, taking up thousands of words, but the general categories of opposition can be […]
