Trader pricing on whether xAI's Grok will post a competitive score on FrontierMath, Epoch AI's benchmark of hundreds of unpublished, expert-level math problems (including open research challenges), reflects Grok's historical lag behind leaders such as OpenAI's GPT-5.4, which reached 47.6% overall and 38% on Tier 4 as of March 2026. Grok 4's last independent evaluation, in July 2025, yielded just 12-14% on Tiers 1-3, and no public FrontierMath results exist for later versions such as Grok 4.1 despite xAI's focus on reasoning agents. Rapid advances by competitors underscore the gap, but a potential Grok 5 release before June 30 could shift the picture as xAI accelerates its multimodal and reasoning work amid intensifying rivalry among AI labs.
Experimental AI summary based on Polymarket data · Updated
$15,883 volume
25%+: 73%
30%+: 69%
40%+: 57%
50%+: 28%
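The thresholds above are nested (a score of 40%+ also counts as 30%+ and 25%+), so the listed prices can be differenced to get a rough implied probability for each score range. The sketch below is illustrative only: it assumes each price can be read directly as a probability, which ignores fees, spreads, and thin liquidity, and the function and variable names are hypothetical.

```python
# Threshold prices copied from the listing above, read as P(score >= threshold).
# Assumption: a market price is treated as a probability (ignores fees and spreads).
threshold_prices = {25: 0.73, 30: 0.69, 40: 0.57, 50: 0.28}


def bucket_probabilities(prices):
    """Convert nested 'at least X%' probabilities into disjoint score ranges."""
    thresholds = sorted(prices)
    # Everything not covered by the lowest threshold falls below it.
    buckets = {f"below {thresholds[0]}%": 1.0 - prices[thresholds[0]]}
    # Each middle range is the difference between adjacent threshold prices.
    for low, high in zip(thresholds, thresholds[1:]):
        buckets[f"{low}% to under {high}%"] = prices[low] - prices[high]
    # The top range is simply the price of the highest threshold.
    buckets[f"{thresholds[-1]}% or more"] = prices[thresholds[-1]]
    return {label: round(p, 2) for label, p in buckets.items()}


implied_buckets = bucket_probabilities(threshold_prices)
for label, p in implied_buckets.items():
    print(f"{label}: {p:.0%}")
```

Read this way, the largest single range is 40% to under 50%, with 50% or more close behind, though with only $15,883 in volume these differences should be taken loosely.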
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from EpochAI; however, a consensus of credible reporting may also be used.
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Frequently Asked Questions