Recent releases of advanced Claude models, including Claude Fable 5 on June 9 and Opus 4.8 variants, have driven strong trader sentiment by pushing Anthropic's frontier large language models to leading or near-leading positions on Humanity's Last Exam (HLE), a 2,500-question benchmark testing graduate-level expertise across math, sciences, and humanities. Current top Claude variants post scores in the 53-65% range on public leaderboards, outpacing earlier 2025 results below 10% and competing closely with GPT-5 series and Gemini previews amid rapid iteration cycles. Competitive pressure from rival labs, combined with Anthropic's focus on reasoning depth and agentic capabilities, supports momentum, though HLE's held-out questions and calibration demands introduce uncertainty around last-minute gains before the June 30 cutoff.
Experimental AI summary referencing Polymarket data. This is not trading advice and does not affect how this market is resolved.
$341,132 Vol.
45%+: 61%
50%+: 35%
55%+: 12%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Be careful with external links.
Frequently Asked Questions