OpenAI's GPT-5.4 Pro set a new FrontierMath record in early March 2026, scoring 38% on Tier 4, the benchmark's hardest research-level math problems. That surpasses the 31% GPT-5.2 Pro posted in January and earlier state-of-the-art scores below 20%. The jump reflects rapid scaling in large language model reasoning, with agents built on GPT-5.4 settling open FrontierMath problems, including elegant proofs of Erdős conjectures. OpenAI leads competitors such as Anthropic's Claude 4.6 and xAI's Grok 4.2, which trail on this metric. Traders are watching for iterative releases, such as a potential GPT-5.5 by June 30, and for Epoch AI updates, since sustained progress could push scores toward 50% amid benchmark saturation risks.
Experimental AI-generated summary referencing Polymarket data · Updated
$18,450 Vol.
60%+: 49%
70%+: 17%
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market Opened: Jan 29, 2026, 12:47 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions