Trader consensus on Polymarket reflects deep skepticism, with implied probabilities under 20% for Anthropic's Claude achieving a meaningful score—likely above 50%—on Humanity’s Last Exam (HLE) by June 30, 2025, given current frontier models' dismal performance. Claude 3.5 Sonnet recently scored just 6.6% on the CAIS/Scale AI benchmark, which features 2,500 expert-level questions across STEM and humanities designed to stump near-term AIs. Primary drivers include HLE's intentional hardness amid scaling plateaus, Anthropic's cautious release cadence (Claude 4 teased for late 2024 but unconfirmed), and fierce competition from OpenAI's o1 series (8.2%) and upcoming GPT-5. Watch for Anthropic's developer updates or AI safety summits, as capability jumps remain speculative despite compute investments.
Experimental AI-generated summary based on Polymarket data · Updated
Anthropic Claude score on Humanity's Last Exam by June 30?
$176,045 Vol.
35%+: 92%
45%+: 36%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market Opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Frequently Asked Questions