Claude at 13% odds to score 50%+ on Humanity's Last Exam, with $148 24h volume and $3.3K liquidity. Closes June 30. Trade on Polymarket via Polymarket Trade.
Humanity's Last Exam is a rigorous benchmark designed to test advanced AI reasoning and general knowledge at expert human levels. The market currently prices Claude at only 13% odds to score at least 50% on this comprehensive test by June 30, 2026. This low probability suggests traders view the benchmark as genuinely difficult and believe Claude's current capabilities, even with planned updates, may fall short of the threshold. The pricing reflects skepticism about both the pace of AI capability improvement and the exam's difficulty level. At 13% implied odds, the market is essentially saying there's roughly a 1-in-8 chance Claude reaches 50%—a meaningful but unlikely outcome. This suggests the broader trading community views such performance as contingent on either unexpected reasoning breakthroughs or favorable alignment between the exam's content and Claude's architectural strengths.
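As a rough check on the "1-in-8" reading above, the conversion from an implied probability to "1-in-N" and odds-against form can be sketched as follows. The function names are illustrative only and are not part of any Polymarket tooling.

```python
def one_in_n(probability: float) -> float:
    """Return N such that the probability equals a 1-in-N chance."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


def odds_against(probability: float) -> float:
    """Return X such that the odds against the outcome are X-to-1."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return (1.0 - probability) / probability


if __name__ == "__main__":
    p = 0.13  # implied probability quoted in the listing
    print(f"1-in-{one_in_n(p):.1f} chance")
    print(f"about {odds_against(p):.1f}-to-1 against")
```

At 13% this gives a value a little under 8, so "roughly 1-in-8" is a fair rounding.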
Humanity's Last Exam is a rigorous benchmark designed to test whether advanced AI systems possess general reasoning, knowledge, and problem-solving capabilities comparable to or exceeding expert human levels. The test is intentionally comprehensive, spanning multiple knowledge domains and cognitive challenges, making it a high bar for AI evaluation. Unlike benchmarks that test narrow pattern-matching or memorization, Humanity's Last Exam requires demonstrable reasoning: tasks demanding understanding, inference, and synthesis. The market's 13% pricing reflects trader consensus that this is a difficult, meaningful test, not a low hurdle.
Several factors could drive the market toward YES. First, Anthropic has demonstrated consistent capability improvements with each Claude release cycle, particularly in reasoning-heavy tasks. If Claude 4.0 or a major update lands before June 30 with substantial gains in multi-step reasoning, chain-of-thought execution, or mathematical problem-solving, it could move materially closer to 50%. Second, the benchmark's structure might inadvertently favor Claude's architectural strengths (deep context windows, instruction-following, or systematic reasoning) more than anticipated. Third, if test creators publish the exam in advance, Anthropic engineers could optimize Claude's approach toward that specific test. Fourth, ensemble or multi-pass strategies might aggregate to strong performance even if single-pass performance is weaker.
Against YES, significant headwinds exist. First, Humanity's Last Exam was explicitly designed to test frontier AI capabilities and identify gaps between current systems and human expertise. A 50% score would imply Claude is competitive with expert humans on a comprehensive test, a claim the industry broadly considers premature as of mid-2026. Second, architectural or computational limitations may constrain Claude's scaling; raw parameter count doesn't automatically translate to reasoning capability. Third, exam creators have strong incentives to maintain rigor and difficulty, resisting "benchmark gaming." Fourth, even if Claude improves, competing systems may improve simultaneously, and the exam is designed to stay ahead of AI progress. The 13% odds imply traders believe such a score requires either unexpected breakthroughs in capabilities, favorable exam-alignment luck, or a reassessment of what "50%" means in context.
The market resolves YES if any Anthropic Claude model scores at least 50% on Humanity's Last Exam by June 30, 2026. Otherwise it resolves NO. Results must be published by the exam creators with transparent grading methodology.
Polymarket Trade is an independent third-party interface to the Polymarket CLOB prediction market exchange on Polygon — not affiliated with Polymarket, Inc. Prediction markets aggregate trader expectations into real-time probability estimates. Every market question resolves YES or NO based on a specific event outcome; traders buy shares of the side they believe will resolve positively. Prices range 0¢ (certain no) to 100¢ (certain yes) and naturally reflect the crowd-implied probability of YES. Polymarket Trade is non-custodial — your funds never leave your wallet. Open the full interactive page linked above to place orders, see order book depth, and execute a trade.
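To make the link between price and probability concrete, here is a minimal sketch of the expected profit on one YES share. It assumes a winning share pays out 100¢ and a losing share pays 0¢, and it ignores fees, spread, and slippage; the function name and the 20% belief figure are hypothetical and used only for illustration.

```python
def expected_profit_cents(price_cents: float, believed_probability: float) -> float:
    """Expected profit, in cents, from buying one YES share at price_cents.

    Assumption: the share pays 100 cents if the market resolves YES and
    0 cents otherwise. Fees and spread are ignored.
    """
    if not 0.0 <= price_cents <= 100.0:
        raise ValueError("price_cents must be between 0 and 100")
    if not 0.0 <= believed_probability <= 1.0:
        raise ValueError("believed_probability must be between 0 and 1")
    expected_payout = believed_probability * 100.0
    return expected_payout - price_cents


if __name__ == "__main__":
    # A trader who agrees with the 13-cent market price expects to break even.
    print(expected_profit_cents(13.0, 0.13))
    # A trader who (hypothetically) believes the true chance is 20% expects a gain.
    print(expected_profit_cents(13.0, 0.20))
```

Under these assumptions the expected profit is zero exactly when the trader's belief matches the price, which is why the price can be read as the crowd-implied probability of YES.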