Google Gemini's underwhelming performance on the newly launched FrontierMath benchmark, where the latest 2.0 Flash Experimental model scores just 11.3% compared to OpenAI's o1-preview at 25.2%, is the primary driver of cautious trader sentiment, with implied probabilities low for a top-tier result by June 30, 2025. Recent Google DeepMind announcements highlight math reasoning gains via "thinking" modes akin to o1, but a lag behind rivals such as Anthropic's Claude 3.5 Sonnet (9.7%) persists on the benchmark's 250 ultra-hard competition-level problems. Competitive pressure intensifies with OpenAI's full o1 release and potential Gemini 2.5 updates; watch Google I/O in Q2 2025 for score reveals or leaderboard submissions that could shift odds.
Experimental AI-generated summary referencing Polymarket data · Updated
$38,776 Vol.
40%+: 94%
45%+: 66%
50%+: 26%
60%+: 17%
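The threshold odds above can be read as cumulative probabilities ("score of at least X%"), in which case the difference between adjacent thresholds gives the implied chance of landing in each score band. The sketch below works through that arithmetic. It assumes the four listed prices are mutually consistent cumulative probabilities, which separate markets are not guaranteed to be, and the names `at_least` and `bucket_probabilities` are illustrative only.

```python
# Threshold -> implied probability of scoring at least that much,
# copied from the listing above and expressed as fractions.
at_least = {40: 0.94, 45: 0.66, 50: 0.26, 60: 0.17}


def bucket_probabilities(at_least):
    """Convert cumulative 'X%+' prices into per-band probabilities.

    Assumes the prices are coherent cumulative probabilities, i.e. they
    do not increase as the threshold rises.
    """
    thresholds = sorted(at_least)
    buckets = {}
    # Probability mass below the lowest listed threshold.
    buckets[f"<{thresholds[0]}%"] = 1.0 - at_least[thresholds[0]]
    # Band [lo, hi): at least lo, but not at least hi.
    for lo, hi in zip(thresholds, thresholds[1:]):
        buckets[f"{lo}-{hi}%"] = at_least[lo] - at_least[hi]
    # Open-ended top band.
    buckets[f"{thresholds[-1]}%+"] = at_least[thresholds[-1]]
    # Round away floating-point noise from the subtractions.
    return {band: round(p, 4) for band, p in buckets.items()}


for band, p in bucket_probabilities(at_least).items():
    print(f"{band}: {p:.0%}")
```

Under that assumption, most of the implied mass sits in the 45-50% band, with the bands summing to 100%.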
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tiers 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from EpochAI; however, a consensus of credible reporting may also be used.
Market Opened: Feb 6, 2026, 6:03 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions