OpenAI's o1 reasoning models dominate trader consensus at a 93% implied probability of fielding the best AI coding model by March 31, driven by leading scores on key benchmarks such as SWE-Bench Verified (48.9%) and LiveCodeBench, where o1-preview outperforms rivals like Claude 3.5 Sonnet on complex, multi-step programming tasks. This edge stems from OpenAI's focus on chain-of-thought reasoning, which enables superior bug-fixing and code generation, and is bolstered by the recent full o1-mini deployment and rumors of iterative updates ahead of the deadline. Supporting factors include OpenAI's rapid release cadence and massive compute resources. Realistic challenges include Anthropic's anticipated Claude 4 launch, Google's Gemini 2.0 advances at its upcoming I/O event, or breakthroughs from cost-efficient open-source players like DeepSeek, any of which could shift the leaderboard before resolution.
Experimental AI summary based on Polymarket data · Updated
OpenAI 93%
Anthropic 5.3%
Google <1%
DeepSeek <1%
$938,331 volume

OpenAI
93%

Anthropic
5%

Google
1%

DeepSeek
<1%

xAI
<1%

Z.ai
<1%

Mistral
<1%

Alibaba
<1%

Moonshot
<1%
If two models are tied for the top LiveBench coding average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “coding average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
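The tie-break rule above can be sketched in a few lines: take the highest LiveBench coding average, and if several companies share it, pick the one whose name (as used in this market group) sorts first alphabetically. The scores below are illustrative placeholders, not real LiveBench numbers.

```python
def resolve_market(scores: dict[str, float]) -> str:
    """Return the winning company per the stated rule: highest coding
    average, with ties broken by alphabetical order of company name."""
    best = max(scores.values())
    # Collect every company at the top score, then sort lexicographically.
    tied = sorted(name for name, score in scores.items() if score == best)
    return tied[0]

# Example: OpenAI and Anthropic tied at the top resolve to Anthropic,
# since "Anthropic" precedes "OpenAI" alphabetically.
print(resolve_market({"OpenAI": 74.2, "Anthropic": 74.2, "Google": 71.0}))
```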
Market opened: Dec 12, 2025, 1:29 PM ET
Resolver
0x2F5e3684c...
Be cautious with external links.