OpenAI's o1 reasoning models dominate trader sentiment with a 91.5% implied probability of having the best AI model for math by March 31, driven by their record-breaking scores on key benchmarks like the MATH dataset (94.8%) and AIME 2024 (83.3%), far surpassing rivals via advanced chain-of-thought techniques. No confirmed challenger has matched these verified leaderboard results, per Hugging Face and LMSYS arenas, amid a lull in major releases. Scenarios that could shift the odds include surprise launches, such as xAI's Grok-3 (slated for early 2025) or an Anthropic Claude 4 preview, exceeding o1's thresholds before the deadline, though a history of delays tempers such bets.
Experimental AI-generated summary referencing Polymarket data · Updated

OpenAI 92%
Anthropic 2.4%
DeepSeek 2.3%
xAI 1.8%
1%
Z.ai <1%
Alibaba <1%
Moonshot <1%
Mistral <1%

$318,387 Vol.
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
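The resolution procedure above reduces to a simple rule: take the top "Mathematics Average" score on the LiveBench leaderboard, and break any tie alphabetically by company name as listed in this market group. A minimal sketch of that logic, using hypothetical scores (not actual LiveBench values):

```python
def resolve_market(scores: dict[str, float]) -> str:
    """Return the winning company given {company: mathematics_average}.

    Highest score wins; a tie for the top score resolves to whichever
    company name comes first in alphabetical order.
    """
    best = max(scores.values())
    tied = [name for name, score in scores.items() if score == best]
    return min(tied)  # min() on strings picks the alphabetically first name

# Hypothetical example: two companies tied at the top score.
print(resolve_market({"OpenAI": 80.1, "Anthropic": 80.1, "xAI": 76.0}))
# "Anthropic" resolves the tie, since it precedes "OpenAI" alphabetically
```

Note that this assumes exact score equality constitutes a tie; the market description does not specify a rounding convention.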
Market Opened: Dec 12, 2025, 1:25 PM ET
Resolver: 0x2F5e3684c...


Beware of external links.
Frequently Asked Questions