OpenAI's o1 reasoning models dominate trader sentiment with a 93.5% implied probability, driven by their reported 94.8% score on the challenging MATH benchmark, well ahead of prior leaders like Claude 3.5 Sonnet, which showcases strong chain-of-thought capabilities on competition-level math problems. Released in September 2024, o1's reinforcement learning edge has held firm on leaderboards from Hugging Face and Epoch AI, with no rival announcement credibly threatening its lead by March 31. Challenges could arise from surprise launches, such as xAI's Grok-3 (targeted for late 2024) or Google's Gemini 2.0, if they demonstrate verifiable math outperformance on standardized evals like GSM8K or AIME. Absent such catalysts, trader consensus reflects o1's entrenched lead.
Experimental AI-generated summary using Polymarket data · Updated
OpenAI 94%
xAI 1.7%
DeepSeek 1.5%
Google 1.0%
$145,419 Vol.

OpenAI
94%

xAI
2%

DeepSeek
2%

Google
1%

Anthropic
1%

Moonshot
1%

Z.ai
<1%

Alibaba
<1%

Mistral
<1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
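The tie-break rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_market`, the input mapping of company name to its best "Mathematics Average" score, and the case-insensitive alphabetical comparison are all hypothetical and not taken from the market rules, and the scores in the usage line are made up.

```python
from typing import Mapping


def resolve_market(scores: Mapping[str, float]) -> str:
    """Pick the company with the highest score.

    Ties go to the name that comes first alphabetically.
    Assumption: the comparison ignores letter case; the rules do not say.
    """
    if not scores:
        # Mirrors the rule that the market stays open while the source is unavailable.
        raise ValueError("no leaderboard data available at check time")
    top = max(scores.values())
    tied = [name for name, score in scores.items() if score == top]
    return min(tied, key=str.casefold)


# Hypothetical scores, for illustration only.
print(resolve_market({"OpenAI": 90.0, "xAI": 90.0, "Google": 88.5}))
```

With the made-up tie above, the sketch picks OpenAI because its name sorts before xAI.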
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...
Use caution with external links.
Frequently Asked Questions