OpenAI's o1 reasoning models have driven trader consensus to a 98.8% implied probability that OpenAI will lead math benchmarks on March 31. The September release achieved state-of-the-art scores (94.8% on the challenging MATH dataset, plus strong marks on AIME problems) through advanced chain-of-thought capabilities that outpace rivals such as Anthropic's Claude 3.5 Sonnet and Google's Gemini 1.5 Pro. No competitor has demonstrated superior general math performance in recent benchmarks or official evals, solidifying OpenAI's position amid a quiet period for AI releases. The main risk is a surprise launch before the deadline, such as Google's anticipated Gemini 2.0 or an upgraded DeepSeek model, though historical timelines suggest such shifts are rare without prior teasers.
Experimental AI-generated summary with Polymarket data · Updated
OpenAI 98.8%
Google <1%
DeepSeek <1%
Anthropic <1%
$442,821 Vol.

OpenAI 99%
Google 1%
DeepSeek <1%
Anthropic <1%
xAI <1%
Moonshot <1%
Z.ai <1%
Mistral <1%
Alibaba <1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
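The resolution rule above (highest LiveBench "Mathematics Average" wins; ties go to the company whose name comes first alphabetically) can be sketched as a short Python function. The leaderboard values below are illustrative placeholders, not real LiveBench scores.

```python
# Sketch of the market's resolution rule: pick the company with the
# highest "Mathematics Average" score; on a tie, the company whose
# name sorts first alphabetically wins.

def resolve(scores: dict[str, float]) -> str:
    # Sort key: score descending (negated), then name ascending,
    # so a tie falls to the alphabetically earlier company name.
    return min(scores, key=lambda company: (-scores[company], company))

# Hypothetical leaderboard snapshot (not real LiveBench values):
snapshot = {
    "OpenAI": 92.4,
    "Google": 92.4,   # tied with OpenAI
    "DeepSeek": 88.1,
    "Anthropic": 85.7,
}

print(resolve(snapshot))  # Google takes the tie: "G" < "O"
```

The tuple key `(-score, name)` encodes both criteria in one comparison, which is why a single `min` call suffices.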
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...