Trader sentiment on Google Gemini's performance on Humanity's Last Exam, a rigorous, crowdsourced benchmark of 2,000+ frontier-hard questions testing AI reasoning across the sciences, leans bearish: implied probabilities are low because scores have stagnated and no upgrade is imminent. Gemini 1.5 Pro currently scores around 7% accuracy per Epoch AI leaderboards, trailing Anthropic's Claude 3.5 Sonnet (9%) and OpenAI's o1-preview (10%), which highlights the competitive pressure in multimodal reasoning. Google I/O announcements focused on Gemini 1.5 Flash efficiency rather than benchmark breakthroughs, and Gemini 2.0 remains slated for late 2024, after the June 30 deadline. Traders are watching for unannounced evals or surprise releases, but the history of slipped AI timelines tempers optimism.
Experimental AI-generated summary with Polymarket data · Updated
Google Gemini score on Humanity's Last Exam by June 30?
40%+: 97%
45%+: 81%
50%+: 40%
55%+: 17%
60%+: 10%
$0.00 Vol.
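The listed outcomes are cumulative ("at least X%"), so the chance the market assigns to each score band is the difference between adjacent thresholds. The sketch below works through that arithmetic, assuming each listed price can be read directly as a probability (fees and spreads are ignored, and with $0.00 volume the prices may carry little information). The function name `bucket_probabilities` is hypothetical and not part of any Polymarket API.

```python
# Cumulative "score is at least X%" prices as listed on the page.
# Assumption: each price is read directly as a probability.
thresholds = [40, 45, 50, 55, 60]
at_least = [0.97, 0.81, 0.40, 0.17, 0.10]


def bucket_probabilities(thresholds, at_least):
    """Convert cumulative 'at least X%' probabilities into per-band probabilities."""
    buckets = {}
    # Probability that the score falls below the lowest threshold.
    buckets[f"<{thresholds[0]}%"] = round(1 - at_least[0], 4)
    # Each middle band is P(at least lower) - P(at least upper).
    for i in range(len(thresholds) - 1):
        label = f"{thresholds[i]}-{thresholds[i + 1]}%"
        buckets[label] = round(at_least[i] - at_least[i + 1], 4)
    # The top band is open-ended, so it keeps its cumulative probability.
    buckets[f"{thresholds[-1]}%+"] = round(at_least[-1], 4)
    return buckets


result = bucket_probabilities(thresholds, at_least)
for band, prob in result.items():
    print(f"{band}: {prob:.0%}")
```

Under these assumptions the most likely band is 45-50%, at about 41%, and the bands sum to 100%.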
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Be careful with external links.
Frequently asked questions