Humanity's Last Exam (HLE) is a comprehensive AI benchmark designed to test advanced language models on complex reasoning, knowledge, and problem-solving across scientific, mathematical, and humanities domains. It has become a key metric in AI research for assessing frontier model capabilities. Anthropic's Claude models have shown strong performance on a range of benchmarks, and this market asks whether Claude will score at least 45% accuracy on this challenging test.

The current 57% YES odds suggest traders consider the 45% threshold likely achievable but far from guaranteed. A score at that level would represent solid performance: well above chance, yet demanding across many difficult knowledge areas. Market pricing has remained relatively stable, indicating balanced sentiment among participants.

Resolution will be determined by the official Humanity's Last Exam results published by the test's creators. The June 30, 2026 resolution date allows sufficient time for any remaining Claude iterations and official evaluation to complete, giving the market clear resolution criteria based on published benchmark results.