OpenAI's o1 reasoning models dominate trader consensus with a 98% implied probability of leading the LiveBench Mathematics leaderboard at the March 31 check, driven by unprecedented scores, such as 94.8% on the MATH dataset and 83.3% on AIME 2024, that far surpass rivals like Google's Gemini and Anthropic's Claude 3.5 Sonnet. Released in September 2024, o1's chain-of-thought approach has redefined AI capabilities in complex problem-solving, and no competing model released in the past month has matched it. This skin-in-the-game positioning reflects traders' aggregated assessment of OpenAI's edge in the large language model race. An upset would require a surprise announcement, such as an upgraded Claude or a new xAI Grok iteration, before the evaluation snapshot; given current timelines and benchmark gaps, that is unlikely absent a major breakthrough.
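The 98% figure above is simply the market's Yes-share price read as a probability: on Polymarket a winning share pays out $1.00, so price and implied probability coincide. A minimal sketch of that conversion (the prices used are illustrative):

```python
# On Polymarket, a Yes share pays $1.00 if the outcome occurs, so the
# share price, as a fraction of the $1 payout, is the implied probability.

def implied_probability(yes_price: float) -> float:
    """Convert a Yes-share price (in dollars) to an implied probability."""
    payout = 1.00  # fixed payout per winning share
    return yes_price / payout

print(f"{implied_probability(0.981):.1%}")  # a $0.981 share implies 98.1%
```

This ignores fees and spread, so the quoted price is only an approximation of the crowd's true probability estimate.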
Experimental AI-generated summary with Polymarket data · Updated

OpenAI 98.1%
Google 1%
Anthropic 1%
DeepSeek <1%
xAI <1%
Z.ai <1%
Alibaba <1%
Moonshot <1%
Mistral <1%
$411,327 Vol.
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
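The rules above amount to a small deterministic procedure: take the highest "Mathematics Average" score, and break ties alphabetically by company name as written in this market group. A minimal sketch of that logic (the company names and scores below are illustrative, not actual LiveBench values):

```python
# Sketch of the market's resolution rule: the company with the highest
# LiveBench "Mathematics Average" wins; ties break alphabetically by the
# company name as it appears in the market group.

def resolve_market(scores: dict[str, float]) -> str:
    """Return the winning company given {company_name: math_average}."""
    best = max(scores.values())
    tied = [name for name, score in scores.items() if score == best]
    return min(tied)  # alphabetical tie-break

# Illustrative scores only (not real LiveBench data):
scores = {"OpenAI": 82.4, "Google": 82.4, "Anthropic": 79.1}
print(resolve_market(scores))  # a tie at 82.4 resolves to "Google"
```

Note that the alphabetical tie-break makes resolution order-independent: only the set of names and scores at check time matters.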
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver: 0x2F5e3684c...
Beware of external links.