What is calibration in prediction markets?
Short answer
Calibration in prediction markets means that when a market prices an event at a given probability, that probability should reflect how often similar events actually occur. A well-calibrated market is one where events priced at, say, seventy percent happen roughly seventy percent of the time, and events priced at twenty percent happen roughly twenty percent of the time.
What to know
Calibration is a measure of accuracy at the aggregate level, not a prediction about any single event. Because any individual outcome is either yes or no, you can only evaluate calibration by looking across many events at once. If you group together a large number of markets that were all trading near the same probability, and the share of them that resolved yes matches that probability, the market is said to be well-calibrated.
This concept comes from forecasting and statistics but applies naturally to prediction markets. In a market, the price is set by the collective judgment of many participants who buy and sell shares based on their beliefs. When participants are knowledgeable and the market has good liquidity, the resulting price tends to carry meaningful information about likelihood. Calibration is the test of whether that information is trustworthy.
Poor calibration shows up in two directions. A market is overconfident if its prices are more extreme than outcomes justify, for example when events priced at ninety percent happen only eighty percent of the time. It is underconfident if its prices stay near the middle even though outcomes are more decisive, for example when events priced at sixty percent happen eighty percent of the time. Either pattern reveals a systematic bias that traders or analysts can study and potentially exploit.
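One rough way to tell the two directions apart is to ask whether prices sit further from fifty percent than the observed outcome rates do. The sketch below is an illustrative heuristic, not a standard named metric. The function name confidence_bias and the (mean price, observed rate, count) layout of each group are assumptions made for this example, and the numbers are hypothetical.

```python
def confidence_bias(bins):
    """Count-weighted average of how much more extreme prices are than outcomes.

    bins -- iterable of (mean_price, observed_rate, count) tuples
    Positive result: prices sit further from 0.5 than outcomes (overconfident).
    Negative result: prices sit closer to 0.5 than outcomes (underconfident).
    """
    total = 0.0
    weight = 0
    for mean_price, observed_rate, count in bins:
        if mean_price == 0.5:
            continue  # a price of exactly 0.5 has no "extreme" direction
        direction = 1.0 if mean_price > 0.5 else -1.0
        total += direction * (mean_price - observed_rate) * count
        weight += count
    return total / weight if weight else 0.0


# Hypothetical groups: prices more extreme than outcomes.
overconfident = [(0.9, 0.8, 50), (0.1, 0.2, 50)]
# Hypothetical groups: prices less extreme than outcomes.
underconfident = [(0.6, 0.8, 50), (0.4, 0.2, 50)]

print(confidence_bias(overconfident))   # positive value
print(confidence_bias(underconfident))  # negative value
```

A value close to zero does not prove good calibration on its own, since errors in different groups can cancel; it is only a quick directional check.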
Calibration is also a way to compare different forecasting systems. A prediction market can be compared against expert panels, polling averages, or statistical models by examining how each source's calibration curve holds up over the same kinds of events and over time. This makes calibration one of the more rigorous tools available for evaluating whether a forecasting source is genuinely informative.
Key points
- Calibration describes how closely stated probabilities match actual outcome frequencies across a large sample of events.
- A single market outcome cannot prove or disprove calibration. You need many resolved markets at similar price levels.
- Well-calibrated markets are neither systematically overconfident nor systematically underconfident.
- Calibration is separate from resolution, the settling of a single market to its outcome. A market can end up on the right side of a specific event while still being poorly calibrated overall.
- Prediction markets are sometimes cited as better calibrated than traditional polls or pundit forecasts because participants have financial incentives to be accurate.
- Calibration can drift over time or vary by topic, so it should be evaluated continuously rather than assumed.
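To see why many resolved markets are needed, it helps to look at how noisy an observed hit rate is for a given sample size. The sketch below uses the textbook binomial standard error and assumes the events are independent, which real markets on related topics often are not. The function name frequency_standard_error is made up for this example.

```python
import math


def frequency_standard_error(p, n):
    """Standard error of the observed hit rate across n independent events
    that each occur with probability p (binomial model, an assumption)."""
    return math.sqrt(p * (1 - p) / n)


# How far the observed rate can plausibly wander from 0.70 by chance alone.
for n in (10, 100, 1000):
    se = frequency_standard_error(0.70, n)
    print(f"n={n:>4}: roughly +/-{2 * se:.3f} around 0.70 (two standard errors)")
```

Under these assumptions the two-standard-error band shrinks from about 0.29 with ten markets to about 0.03 with a thousand, so a small group of markets cannot distinguish a calibrated seventy percent from a miscalibrated one.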
How it compares
- Polls measure stated opinion or preference. They are not designed to produce calibrated probabilities and often lack the incentive structure to correct overconfidence.
- Weather forecasts are a classic example of calibrated probabilistic prediction. Meteorologists have refined their models over decades, and their probability statements are regularly tested against outcomes.
- Gambling odds from bookmakers embed a margin, so the probabilities implied by the odds on all outcomes add up to more than one hundred percent. The posted price is not purely a probability estimate, which makes direct calibration comparisons complex.
- Expert panels and forecasting tournaments can produce well-calibrated outputs when participants are scored on accuracy, but they depend on the quality and diversity of the panel.
- Statistical models can achieve good calibration within the domain they were trained on but may fail when conditions change.
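For the bookmaker comparison above, the margin has to be removed before the odds can be read as probabilities. The sketch below assumes decimal odds for a complete set of mutually exclusive outcomes and uses proportional normalization, which is one simple convention among several and not necessarily how any particular bookmaker sets its prices. The function name and the example odds are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds into probabilities that sum to 1.

    Returns (probabilities, margin), where margin is how far the raw
    implied probabilities exceed 1.0 before normalization.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    overround = sum(raw)  # above 1.0 when a margin is built into the odds
    probabilities = [r / overround for r in raw]
    return probabilities, overround - 1.0


# Hypothetical two-outcome event quoted at 1.80 and 2.00.
probs, margin = implied_probabilities([1.80, 2.00])
print([round(p, 3) for p in probs], round(margin, 3))
```

Only the normalized values are comparable to a market price when building a calibration curve, and the choice of normalization method can itself shift the result slightly.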
FAQ
How do you actually measure calibration?
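You collect many resolved markets, record the price each one was trading at at a fixed reference point (for example, a set amount of time before it resolved), and group markets with similar prices together. For each group you compare the average price to the share of markets that actually resolved yes. Plotting predicted rate against observed rate for every group produces a calibration curve, and a perfectly calibrated source falls on the straight diagonal line where the two are equal.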
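The grouping procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: prices are expressed between 0 and 1, outcomes are coded as 1 for yes and 0 for no, groups are ten equal-width bins, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def calibration_curve(prices, outcomes, n_bins=10):
    """Return (mean_price, observed_rate, count) for each non-empty bin.

    prices   -- market prices in [0, 1], taken at the chosen reference point
    outcomes -- 1 if the event happened, 0 if it did not
    """
    bins = defaultdict(list)
    for price, outcome in zip(prices, outcomes):
        # Clamp so that a price of exactly 1.0 falls in the top bin.
        index = min(int(price * n_bins), n_bins - 1)
        bins[index].append((price, outcome))

    curve = []
    for index in sorted(bins):
        members = bins[index]
        mean_price = sum(p for p, _ in members) / len(members)
        observed_rate = sum(o for _, o in members) / len(members)
        curve.append((mean_price, observed_rate, len(members)))
    return curve


# Hypothetical data: ten markets priced at 0.72 and ten priced at 0.23.
prices = [0.72] * 10 + [0.23] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 2 + [0] * 8

for mean_price, observed_rate, count in calibration_curve(prices, outcomes):
    print(f"priced near {mean_price:.2f}, happened {observed_rate:.2f} of the time (n={count})")
```

Each tuple is one point on the calibration curve. The count is returned alongside it because a point built from a handful of markets deserves much less weight than one built from hundreds.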
Does good calibration mean a market is always right?
No. Calibration is a statistical property, not a guarantee about individual events. Among events that a well-calibrated market prices at seventy percent, roughly three in ten still do not happen. Being wrong on a specific outcome is entirely consistent with good calibration.
Can a market be calibrated but still useless?
Yes. A market that always prices everything at fifty percent would be well-calibrated if about half of those events happened, but it would provide no useful information. Calibration is most valuable when combined with sharpness, meaning the market is willing to assign probabilities far from fifty percent and still gets the frequencies right.
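The difference between a calibrated but flat source and a calibrated and sharp one can be made concrete with a scoring rule. The sketch below uses the Brier score, a standard measure that the text above does not mention and that is brought in here only for illustration. The two sources and all outcomes are hypothetical.

```python
def brier_score(prices, outcomes):
    """Mean squared difference between prices and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(prices, outcomes)) / len(prices)


# Hypothetical set of 20 events, 10 of which happened.
outcomes = [1] * 9 + [0] + [1] + [0] * 9

# Source A prices everything at 0.50. Half the events happened,
# so it is calibrated, but it never separates likely from unlikely.
flat_prices = [0.5] * 20

# Source B prices the first ten events at 0.90 (9 of 10 happened) and the
# last ten at 0.10 (1 of 10 happened), so it is calibrated and sharp.
sharp_prices = [0.9] * 10 + [0.1] * 10

print(f"flat source:  {brier_score(flat_prices, outcomes):.2f}")
print(f"sharp source: {brier_score(sharp_prices, outcomes):.2f}")
```

Both sources match their stated frequencies on this data, yet the sharp one earns a much lower score because its prices carry information about which events are likely.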
Why do incentives matter for calibration?
When participants gain or lose money based on whether their probability assessments are correct, they have a reason to invest effort in accuracy. Markets aggregate these individually motivated assessments. Without meaningful stakes, stated probabilities may reflect social signaling or casual guessing rather than genuine belief, which tends to hurt calibration.
Is calibration the same as resolution?
No. Resolution is the settling of a single market to its final outcome, and whether the price leaned the right way is a fact about that one event. Calibration is a property of the system across many predictions. A forecaster could be lucky on individual calls while being systematically miscalibrated, or could be well-calibrated statistically while getting a high-profile event wrong.
What topics tend to have better or worse calibration?
Markets on topics with clear, objective resolution criteria and rich public information tend to show better calibration. Markets on rare, ambiguous, or highly novel events are harder to calibrate because there is less historical base rate information to anchor the price.