OpenAI commands a 93% implied probability on Polymarket for the best AI coding model by March 31, driven by its o1-preview and o1-mini models topping key benchmarks like SWE-Bench Verified (48.9% success rate) and LiveCodeBench, where advanced chain-of-thought reasoning excels in complex software engineering tasks. Trader consensus reflects OpenAI's rapid iteration cycle, unmatched compute scale via Microsoft partnership, and history of benchmark leadership, with no major rivals surpassing it recently despite Claude 3.5 Sonnet's strong showings. Challenges could arise from Anthropic's anticipated Claude 4 release, Google's Gemini 2.0 advancements, or unexpected open-source leaps from DeepSeek, potentially shifting leaderboards before the deadline.
Experimental AI-generated summary referencing Polymarket data · Updated
OpenAI 93%
Anthropic 5.3%
Google <1%
DeepSeek <1%
$938,331 Vol.

OpenAI
93%

Anthropic
5%

Google
1%

DeepSeek
<1%

xAI
<1%

Z.ai
<1%

Mistral
<1%

Alibaba
<1%

Moonshot
<1%
If two models are tied for the top LiveBench coding average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “coding average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
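The resolution rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_market`, the input mapping of company name to its best LiveBench coding average, and the example scores are all hypothetical, and the comparison is done case-insensitively, which the rules do not specify. Handling of an unavailable leaderboard is left out.

```python
from typing import Mapping


def resolve_market(coding_averages: Mapping[str, float]) -> str:
    """Pick the company with the top coding average.

    Ties are broken by company name in alphabetical order, as the
    market rules describe. Case-insensitive ordering is an assumption.
    """
    if not coding_averages:
        raise ValueError("no leaderboard entries to resolve on")

    top_score = max(coding_averages.values())
    # Every company whose score equals the top score is part of the tie.
    tied = [name for name, score in coding_averages.items() if score == top_score]
    # Alphabetical tie-break; casefold so "xAI" and "Z.ai" compare by letter.
    return min(tied, key=str.casefold)


# Hypothetical scores, for illustration only (not real leaderboard data).
example = {"OpenAI": 80.0, "Anthropic": 80.0, "Google": 75.0}
print(resolve_market(example))  # Anthropic
```

With a clear leader the tie-break never applies; it only matters when two or more companies share the exact top score at check time.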
Market Opened: Dec 12, 2025, 1:29 PM ET
Resolver
0x2F5e3684c...
Beware of external links.
Frequently Asked Questions