OpenAI's o1-preview and o1-mini models lead math benchmarks like the MATH dataset (94.8% accuracy) and AIME 2024 (an 83% solve rate for o1-preview), anchoring OpenAI's 92.5% market-implied odds of fielding the best AI for math by March 31. This dominance reflects o1's superior chain-of-thought reasoning, far ahead of Anthropic's Claude 3.5 Sonnet (71% on MATH), DeepSeek's math specialists, and others, with no rival matching its demonstrated capabilities. Trader sentiment hinges on OpenAI's iteration speed, but realistic challenges include pre-deadline launches such as xAI's Grok-3, Google's Gemini 2.0 updates, or DeepSeek's next-gen releases, any of which could shift the leaderboard if they exceed current thresholds.
Experimental AI-generated summary referencing Polymarket data · Updated

Current market-implied odds ($184,706 Vol.):
OpenAI 93%
DeepSeek 2.8%
Anthropic 2.4%
xAI 1.9%
1%
Z.ai <1%
Alibaba <1%
Moonshot <1%
Mistral <1%
The primary source of resolution for this market will be LiveBench's AI leaderboard, specifically the "Mathematics Average" category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, the market will resolve to whichever company's name, as it is described in this market group, comes first in alphabetical order.
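The resolution logic above — highest LiveBench Mathematics Average, with an alphabetical tiebreak on the company name — can be sketched as follows. This is a minimal illustration only: the entries, scores, and field names below are hypothetical and do not reflect LiveBench's actual data format or current standings.

```python
# Sketch of the resolution rule: the winner is the company whose model holds
# the highest "Mathematics Average"; ties break in favor of the company whose
# name sorts first alphabetically. All scores here are made-up placeholders.
leaderboard = [
    {"company": "OpenAI",    "math_avg": 94.8},
    {"company": "DeepSeek",  "math_avg": 94.8},  # hypothetical tie with OpenAI
    {"company": "Anthropic", "math_avg": 71.0},
]

def resolve(entries):
    # min() with a (negated score, name) key: highest score first,
    # then alphabetically earliest name on a tie.
    winner = min(entries, key=lambda e: (-e["math_avg"], e["company"]))
    return winner["company"]

print(resolve(leaderboard))  # DeepSeek wins the tie alphabetically
```

Negating the score inside the key tuple lets a single `min()` call express both the descending score comparison and the ascending alphabetical tiebreak.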
Market Opened: Dec 12, 2025, 1:25 PM ET
Resolver: 0x2F5e3684c...


Frequently Asked Questions