OpenAI's latest GPT-5.4 model, released in March 2026, scores 41.6-44.3% on Humanity's Last Exam, a 2,500-question frontier benchmark testing PhD-level reasoning across math, sciences, and humanities. That is an 8-point jump from GPT-5.2's mid-30s without tools, but it still falls short of the 50% threshold. The progress reflects aggressive scaling, with frontier large language models gaining 30 percentage points annually per the Stanford AI Index, yet GPT-5.4 trails Google's Gemini 3.1 Pro Preview at 44.7% and Anthropic's Claude Opus 4.6, which nears 50% in thinking modes. Traders are watching for potential GPT-5.5 previews or enhanced reasoning chains before June 30, amid benchmark saturation risks and competitive leaks, while tool-assisted evals already top 58%.
Experimental AI-generated summary with Polymarket data. This is not trading advice and plays no role in the resolution of this market.
$15,078 Vol.
50%+: 53%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Be careful with external links.
Frequently Asked Questions