Will OpenAI's next released model debut with a benchmark score of at least 1520? Market odds: 2% YES. Track AI capability expectations and live trading data.
This market has been archived. Historical content preserved below.
OpenAI regularly releases new models with published benchmark scores that measure reasoning, coding, and knowledge capabilities across standardized tests. A score threshold of 1520 represents a specific capability target that traders currently view as highly unlikely, as reflected in the 2% YES odds. The extreme skepticism suggests the market expects either that OpenAI's next model will score below 1520, or that a model exceeding this score will be announced only after the June 30 resolution deadline. The low YES price indicates traders believe the threshold is ambitious relative to typical model improvement trajectories.
OpenAI's model release history shows consistent performance improvements from GPT-3.5 through GPT-4 and its variants, with each generation pushing higher on academic benchmarks like MMLU, GPQA, and math reasoning tests. The company typically releases benchmark data in technical papers and official announcements, allowing public verification of capability claims. A score of 1520 sets a specific, measurable bar: higher than typical intermediate model releases but potentially achievable for a major flagship model if it represents a substantial capability leap. Historical patterns show OpenAI often releases incremental improvements before major generational shifts, so traders may expect the next release to be iterative rather than transformative. Competing labs including Anthropic and Google DeepMind continue advancing their own models, which could influence whether 1520 becomes a realistic target. The 2% odds reflect strong consensus that this specific threshold is unlikely to be met, possibly because traders expect diminishing returns in benchmark improvements or believe the benchmark itself may shift before resolution. The lopsided pricing, near 98% NO, suggests that the threshold is set well above expected performance, that the benchmark definition is ambiguous, or that traders have low confidence OpenAI will publish a verifiable score before the deadline.
Market resolves YES if OpenAI officially releases a model by June 30, 2026 with a documented benchmark score of at least 1520. Resolves NO if no model is released with a verified score at or above this threshold by the deadline.
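The resolution rule above can be read as a simple two-part check. The sketch below is illustrative only: the function name `resolves_yes` is hypothetical, and it assumes that "by June 30, 2026" includes releases on June 30 itself, which the rule text does not state explicitly.

```python
from datetime import date
from typing import Optional

DEADLINE = date(2026, 6, 30)  # deadline from the resolution text
THRESHOLD = 1520              # minimum documented benchmark score


def resolves_yes(release_date: Optional[date],
                 verified_score: Optional[float]) -> bool:
    """Return True if a release would satisfy the YES condition.

    Assumption: the deadline is inclusive. A missing release date or a
    missing verified score is treated as NO.
    """
    if release_date is None or verified_score is None:
        return False
    return release_date <= DEADLINE and verified_score >= THRESHOLD
```

Under these assumptions, a model released on time with a score of 1519, or a model scoring 1520 but released in July 2026, would both leave the market at NO.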
Polymarket Trade is an independent third-party interface to the Polymarket CLOB prediction market exchange on Polygon and is not affiliated with Polymarket, Inc. Prediction markets aggregate trader expectations into real-time probability estimates. Every market question resolves YES or NO based on a specific event outcome; traders buy shares of the side they expect to win. Prices range from 0¢ (certain NO) to 100¢ (certain YES) and reflect the crowd-implied probability of YES. Polymarket Trade is non-custodial: your funds never leave your wallet. Open the full interactive page linked above to place orders, see order book depth, and execute a trade.
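As a rough illustration of how a share price maps to a probability, the sketch below converts a price in cents into the crowd-implied probability of that side. The helper names are hypothetical, and the payout calculation assumes that a winning share settles at 100¢ and that fees and bid-ask spread are ignored; neither assumption is stated in the text above.

```python
def implied_probability(price_cents: float) -> float:
    """Convert a share price in cents (0 to 100) to an implied probability."""
    if not 0 <= price_cents <= 100:
        raise ValueError("price must be between 0 and 100 cents")
    return price_cents / 100


def payout_multiple(price_cents: float) -> float:
    """Gross return per cent staked if the side bought wins.

    Assumption: a winning share settles at 100 cents; fees are ignored.
    """
    if not 0 < price_cents <= 100:
        raise ValueError("price must be above 0 and at most 100 cents")
    return 100 / price_cents


yes_price = 2  # the 2% YES odds quoted for this market
print(implied_probability(yes_price))        # implied chance of YES
print(implied_probability(100 - yes_price))  # implied chance of NO
print(payout_multiple(yes_price))            # gross multiple on a YES win
```

At a 2¢ YES price, the implied probability of YES is 0.02, and under the stated assumptions a YES share would return 50 times its cost if the market resolved YES.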