Trader consensus on Polymarket reflects Anthropic's Claude models trailing the frontier on Humanity's Last Exam, a 2,500-question multi-modal test probing expert-level knowledge across math, physics, biology, and more, where top scores reach only about 45%, far from human expert performance. Claude Opus 4.6 (with chain-of-thought reasoning) achieved 34.44% on Scale AI's leaderboard as of early 2026, ranking fourth behind Gemini 3.1 Pro Preview (45%) and GPT-5.4 variants. Its standing was bolstered by Anthropic's February Opus 4.6 release and March's flurry of updates enhancing reasoning and tool use. Competitive pressure from OpenAI and Google is intensifying, but Claude 5's anticipated Q2 launch could deliver capability leaps before June 30, with resolution tied to official evals of the latest model iteration.
Experimental AI-generated summary based on Polymarket data · Updated
Anthropic Claude score on Humanity's Last Exam by June 30?
$191,625 Vol.
35%+: 96%
45%+: 65%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market Opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Frequently Asked Questions