Trader consensus on Polymarket reflects skepticism toward Google Gemini achieving a breakthrough score on Humanity’s Last Exam by June 30, driven by the benchmark’s extreme difficulty and current leaderboard standings. Launched December 3, 2024, by Scale AI and the Center for AI Safety, the exam features 2,500 frontier-level questions where top models like OpenAI’s o1-preview score just 9.1% and Gemini 2.0 Pro Experimental hits 7.4%, despite Google’s Gemini 2.0 release on December 11 showcasing multimodal advances. No verified progress on HLE since then underscores scaling challenges in reasoning and knowledge synthesis. Key catalysts ahead include Google I/O in May for potential Gemini 2.5 announcements and competitive pressure from Anthropic or xAI model updates, though historical timelines suggest ambitious leaps remain uncertain.
Experimental AI-generated summary referencing Polymarket data · Updated
$132,621 Vol.
40% or higher: 98%
45% or higher: 83%
50% or higher: 40%
55% or higher: 16%
60% or higher: 11%
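The listed outcomes are cumulative thresholds, so the chance the market assigns to each score range can be read off by differencing adjacent rows. The sketch below is illustrative only. It assumes each quoted percentage can be treated directly as an implied probability, ignoring fees, spreads, and thin liquidity. The function name `implied_buckets` and the bucket labels are hypothetical and are not part of Polymarket or its API.

```python
# Minimal sketch: turn cumulative "X% or higher" odds into per-range odds.
# Assumption: quoted percentages are taken at face value as probabilities.

def implied_buckets(thresholds: dict[int, int]) -> dict[str, int]:
    """Map {score threshold: percent chance of scoring at or above it}
    to {score range label: percent chance of landing in that range}."""
    items = sorted(thresholds.items())
    # Cumulative odds must not rise as the threshold rises.
    for (_, hi_odds), (_, lo_odds) in zip(items, items[1:]):
        if lo_odds > hi_odds:
            raise ValueError("cumulative odds must be non-increasing")

    buckets: dict[str, int] = {}
    first_threshold, first_odds = items[0]
    buckets[f"below {first_threshold}%"] = 100 - first_odds
    for (t_low, p_low), (t_high, p_high) in zip(items, items[1:]):
        buckets[f"{t_low}% to below {t_high}%"] = p_low - p_high
    last_threshold, last_odds = items[-1]
    buckets[f"{last_threshold}% or higher"] = last_odds
    return buckets


# Odds as listed on the page (percent).
market = {40: 98, 45: 83, 50: 40, 55: 16, 60: 11}
buckets = implied_buckets(market)
for label, pct in buckets.items():
    print(f"{label}: {pct}%")
```

Under these assumptions, the largest single share of probability falls in the 45% to below 50% range, which is where the sharpest drop between adjacent thresholds occurs.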
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Frequently Asked Questions