OpenAI's o1 reasoning models, released in late 2024, currently lead coding benchmarks such as the LMSYS Chatbot Arena and HumanEval, with step-by-step problem-solving that carries over to real-world programming tasks; traders accordingly price a 94.8% implied probability that OpenAI leads on March 31. This edge stems from o1's chain-of-thought capabilities outperforming rivals, including Anthropic's Claude 3.5 Sonnet, and recent blind evaluations confirm OpenAI's competitive position amid an accelerating AI model race. Regulatory scrutiny of AI safety and compute constraints add uncertainty, and realistic challengers include an Anthropic Claude 4.0 release, Google's Gemini 2.0 advancements, or xAI's Grok 3, though release timelines remain speculative and OpenAI's scaling momentum favors it ahead of potential Q1 2025 launches.
Experimental AI summary based on Polymarket data · Updated
OpenAI 94.9% · Anthropic 3.4% · DeepSeek <1% · Google <1%
$1,029,158 trading volume
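The implied probabilities above follow directly from share prices. As a sketch: a YES share pays out $1 if the outcome occurs, so its dollar price is read as the market's probability. The $0.948 price below is an illustrative assumption matching the ~94.8% figure in the summary, not a live quote.

```python
# Sketch: converting a prediction-market YES-share price to an
# implied probability. Prices here are illustrative, not live quotes.

def implied_probability(yes_price: float) -> float:
    """Convert a YES-share price in dollars to a percent probability.

    A YES share pays $1 if the outcome occurs, so a price of $0.948
    implies the market assigns ~94.8% probability to that outcome.
    """
    return round(yes_price * 100, 1)

print(implied_probability(0.948))  # prints 94.8
```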

OpenAI 95%
Anthropic 3%
DeepSeek 1%
Google <1%
xAI <1%
Z.ai <1%
Mistral <1%
Alibaba <1%
Moonshot <1%
If two models are tied for the top LiveBench coding average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “coding average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
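The resolution procedure above can be sketched in code. The list-of-pairs leaderboard format, company names, and scores below are illustrative assumptions for the sketch, not LiveBench's actual data format.

```python
# Hypothetical sketch of the market's stated resolution logic:
# highest coding average wins; ties break alphabetically by the
# company name as it is described in the market group.
# Data format and scores are illustrative assumptions.

def resolve_market(leaderboard):
    """Return the winning company from (company, coding_average) pairs."""
    top_score = max(score for _, score in leaderboard)
    # Exact float equality is fine for this sketch, since tied entries
    # would share the identical published score.
    tied = [company for company, score in leaderboard if score == top_score]
    return min(tied)  # alphabetical order: first name wins the tie

# Example: a tie at the top resolves to the alphabetically first name.
board = [("OpenAI", 78.2), ("Anthropic", 78.2), ("DeepSeek", 71.5)]
print(resolve_market(board))  # prints Anthropic
```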
Market opened: Dec 12, 2025, 1:29 PM ET
Resolver
0x2F5e3684c...