OpenAI's o1 reasoning model dominates math benchmarks, anchoring its 98.1% market-implied probability as the top AI for math by March 31. Achieving 94.1% on the challenging MATH dataset and 83.3% on AIME 2024—far surpassing Google's Gemini 1.5 Pro (67.7% MATH) and Anthropic's Claude 3.5 Sonnet (around 60%)—o1's chain-of-thought capabilities have held firm since its September 2024 launch, with no rival releases closing the gap in recent months. Trader consensus, backed by real capital, reflects OpenAI's edge in scaling reasoning-focused large language models amid a quiet competitive landscape. Potential challengers include an early Gemini 2.0 drop, xAI's Grok-3 preview, or benchmark shifts, but execution risks and timelines make upsets unlikely absent major announcements.
Experimental AI summary based on Polymarket data · Updated

$411,328 volume

OpenAI — 98%
Google — 1%
Anthropic — 1%
DeepSeek — <1%
xAI — <1%
Z.ai — <1%
Alibaba — <1%
Moonshot — <1%
Mistral — <1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
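The tie-break rule above can be sketched as a small helper. This is a hypothetical illustration only: LiveBench exposes no such API, the function name is invented, and the scores in the example are made up.

```python
def resolve_market(scores: dict[str, float]) -> str:
    """Pick the winning company from LiveBench Mathematics Average scores.

    Per the market rules, a tie on the top score resolves to whichever
    company name comes first in alphabetical order.
    """
    top = max(scores.values())
    tied = [name for name, score in scores.items() if score == top]
    return min(tied)  # alphabetical tie-break

# Hypothetical scores, for illustration only:
print(resolve_market({"OpenAI": 94.1, "Google": 94.1, "Anthropic": 60.0}))
# → Google (tie broken alphabetically)
```

Note that `min` over the tied names implements the alphabetical rule directly, since Python compares strings lexicographically.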
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver: 0x2F5e3684c...