Google's Gemini 3.1 Pro, released in February 2026, currently leads the Humanity's Last Exam leaderboard with 44.7% accuracy on its 2,500 expert-level questions spanning math, the sciences, and the humanities. It edges out OpenAI's GPT-5.4 at 41.6% and marks a leap from Gemini 3 Pro's 37.5% six months earlier. The gain is attributed to enhanced multimodal reasoning in Google's large language model, and DeepMind's own no-tools evaluation reports a similar 44.4%. Trader sentiment hinges on scaling trends amid fierce rivalry with Anthropic's Claude Opus 4.6. Watch Google I/O in May for Gemini 4 previews or updates that could push scores toward 50% before the June 30 resolution, though it remains uncertain whether the benchmark is approaching saturation.
Experimental AI-generated summary with Polymarket data · Updated
Google Gemini score on Humanity's Last Exam by June 30?
$202,880 Vol.
40%+: 95%
45%+: 83%
50%+: 39%
55%+: 16%
60%+: 10%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver: 0x65070BE91...
Frequently Asked Questions