OpenAI's latest frontier model, GPT-5.4, released in early March 2026, scores around 40-44% on Humanity's Last Exam, trailing Google's Gemini 3.1 Pro Preview at approximately 45%. Humanity's Last Exam is a rigorous benchmark of 2,500 expert-level questions spanning math, science, and the humanities, designed to test models beyond saturated evaluations. Trader consensus reflects caution amid rapid iteration: recent variants such as GPT-5.4-Cyber and GPT-Rosalind (April 16) show specialized gains but no broad leap toward the 50% threshold ahead of the June 30 deadline. Competitive pressure from Anthropic's Claude Opus 4.6 and new hard benchmarks such as ARC-AGI-3 underscore scaling limits, though upcoming releases could shift the odds if OpenAI demonstrates enhanced reasoning capabilities before the deadline.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and plays no role in the resolution of this market.
$15,078 Vol.
50%+
53%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently asked questions