OpenAI's latest GPT-5.4 model, released March 5, 2026, scores 41.6% on Humanity's Last Exam, a frontier benchmark of 2,500 expert-level questions across 100+ subjects. It trails Google's Gemini 3.1 Pro Preview at 44.7%, per the April 21 leaderboards from Artificial Analysis. This marks a sharp rise from the sub-10% scores of earlier models like GPT-4o, driven by scaling and post-training optimizations that improve reasoning on novel academic challenges. With tools, GPT-5.4 Pro reaches 58.7%, but standard evals are tool-free. Trader consensus hinges on whether OpenAI's rapid iteration cycle yields GPT-5.5 or other updates by June 30, amid competition from Anthropic's Claude Opus 4.6 and emerging open models like Moonshot's Kimi K2.6. Watch for developer conference announcements or independent eval updates that could shift implied probabilities.
Experimental AI-generated summary with Polymarket data. This is not trading advice and does not influence how this market resolves. · Updated
$15,368 Vol.
50%+
61%
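The two numbers above can be read together with the summary's figures. The sketch below is only an illustration: it assumes "50%+" refers to a score threshold on Humanity's Last Exam and that the 61% is the price of a share paying $1 if that outcome resolves true. The function names and the 0.70 example belief are hypothetical, and fees are ignored.

```python
# Figures quoted on this page. Treating "50%+" as a score threshold
# is an assumption, not something the page states outright.
BEST_OPENAI_SCORE = 41.6  # GPT-5.4, tool-free, per the summary
THRESHOLD = 50.0          # assumed meaning of the "50%+" outcome
YES_PRICE = 0.61          # the 61% shown for that outcome


def points_needed(score: float, threshold: float) -> float:
    """Percentage points still missing to reach the threshold."""
    return round(max(0.0, threshold - score), 1)


def edge_per_share(own_probability: float, price: float) -> float:
    """Expected profit per $1-payout share, ignoring fees."""
    return round(own_probability - price, 4)


if __name__ == "__main__":
    print(points_needed(BEST_OPENAI_SCORE, THRESHOLD))
    # 0.70 is a hypothetical personal estimate, used only as an example.
    print(edge_per_share(0.70, YES_PRICE))
```

Under these assumptions, a share bought at 0.61 has positive expected value only for someone who puts the probability above 61%.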
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently asked questions