Google's Gemini 3.1 Pro Preview leads Humanity's Last Exam—a rigorous 2,500-question benchmark probing frontier AI reasoning in math, sciences, and humanities—with 44-46% accuracy using "thinking high" modes, outpacing OpenAI's GPT-5.4 at 41-44% per recent leaderboards like Artificial Analysis and Scale Labs. February's Gemini 3 Deep Think upgrade hit a then-record 48.4%, but March 2026 releases of Gemini 3.1 Flash variants boosted multimodal capabilities amid intensifying competition. Meta's Muse Spark, launched April 8, surged ahead at 50%+, highlighting agentic orchestration gains. With Google I/O looming in May, traders watch for Gemini 4 previews or scaling advances that could push scores higher by June 30 amid volatile benchmark races.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$294,886 Vol.
50%+: 39% chance
55%+: 19% chance
60%+: 9% chance
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market Opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions