Google's Gemini 3.1 Pro Preview currently leads the Scale Labs Humanity’s Last Exam leaderboard—the market resolution source—with a 46.4% score without tools as of early May 2026, driving trader optimism for crossing 50% by June 30 amid rapid model iterations. This builds on Gemini 3 Deep Think's February benchmark of 48.4%, fueled by Google's March 2026 releases like Gemini 3.1 Flash and the May 7 Gemini 3.1 Flash-Lite, emphasizing enhanced reasoning and efficiency. Competitive pressure from Anthropic's Claude Mythos Preview (64.7% on some boards) and OpenAI's GPT-5.5 series accelerates progress, with Stanford's AI Index 2026 noting frontier models hitting 50%+. Upcoming previews or Gemini 4 could catalyze shifts, though evaluation lags and contamination risks add uncertainty.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and plays no role in how this market resolves.
$312,011 volume
50% or above: 54%
55% or above: 35%
Above 60%: 7%
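Because the three outcomes are nested thresholds rather than separate ranges, the listed percentages overlap. As a rough sketch, the snippet below converts them into non-overlapping score ranges by subtracting each threshold's price from the one before it. It assumes the displayed prices can be read directly as probabilities (ignoring fees, spreads, and rounding) and glosses over the difference between "60% or above" and "above 60%". The function name `implied_buckets` and the bucket labels are illustrative, not part of Polymarket.

```python
# Displayed prices for the nested thresholds, read as probabilities.
# ASSUMPTION: price == probability; fees, spreads, and rounding are ignored.
thresholds = [
    ("50% or above", 0.54),
    ("55% or above", 0.35),
    ("above 60%", 0.07),
]


def implied_buckets(nested):
    """Convert probabilities of nested thresholds into disjoint bucket probabilities."""
    probs = [p for _, p in nested]
    # Nested events must have non-increasing probabilities.
    if any(a < b for a, b in zip(probs, probs[1:])):
        raise ValueError("threshold probabilities must not increase")
    buckets = {"below first threshold": round(1 - probs[0], 4)}
    # Each bucket is "reaches this threshold but not the next one".
    for (label, p), next_p in zip(nested, probs[1:] + [0.0]):
        buckets[f"{label}, but not the next threshold"] = round(p - next_p, 4)
    return buckets


buckets = implied_buckets(thresholds)
for name, prob in buckets.items():
    print(f"{name}: {prob:.0%}")
```

Read this way, the market puts roughly 46% on the top score staying under 50%, 19% on it landing between 50% and 55%, 28% on it landing between 55% and 60%, and 7% on it exceeding 60%.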
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Frequently asked questions