xAI's Grok models currently lag on Epoch AI's FrontierMath benchmark, a rigorous test of advanced mathematical reasoning built on research-level problems. The most recently evaluated model, Grok-4, scores just 2% on Tier 4, trailing OpenAI's GPT-5.4 Pro at 37.5% and Claude Opus 4.6 at around 21%, per Epoch's January 2026 leaderboard. Recent xAI releases such as Grok 4.20 and Grok 4 Fast have dominated agentic and coding benchmarks (89% on LiveCodeBench and #1 on BridgeBench multi-agent tasks) but have no updated FrontierMath results amid rapid iteration. Elon Musk announced on April 8 that Colossus 2 trains models of up to 10 trillion parameters, which could position xAI to leapfrog competitors before the June 30 resolution, though product timelines often slip and pure math gains trail agentic advances. Traders are watching for Grok-5 previews or independent evaluations as key catalysts.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not affect how this market resolves.
xAI Grok score on the FrontierMath Benchmark by June 30?
Volume: $19,259
25%+: 56%
30%+: 53%
40%+: 62%
50%+: 10%
This market will resolve according to Epoch AI's FrontierMath benchmark leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Do not trust external links.
Frequently asked questions