OpenAI's GPT-5.4, launched earlier this month, has solidified trader consensus at a 97.1% implied probability that OpenAI will have the leading AI coding model by March 31. The model posted record scores on key benchmarks such as SWE-Bench Verified and Terminal-Bench, where it excels at agentic tasks like debugging and multi-step code generation. Its unified GPT-Codex architecture outperforms rivals in real-world software engineering evaluations, according to developer reports and OpenAI's demonstrations, and no major competitor has released a model in the past week. Anthropic's Claude Opus 4.6 trails closely on some leaderboards, but traders see limited upside. Realistic challenges include a surprise Google Gemini update or a benchmark reevaluation before resolution, though the proximity of the deadline tempers volatility.
Experimental AI-generated summary based on Polymarket data · Updated
OpenAI 97.0%
Anthropic 1.7%
Google <1%
DeepSeek <1%
$1,057,610 Vol.

OpenAI 97%
Anthropic 2%
Google 1%
DeepSeek 1%
xAI <1%
Z.ai <1%
Mistral <1%
Alibaba <1%
Moonshot <1%
If two models are tied for the top LiveBench coding average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “coding average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
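
The tie-break rule above can be sketched in a few lines of Python. This is only an illustration, not Polymarket's or LiveBench's actual resolution logic: the function name `resolve_winner`, the input shape (each company mapped to its best coding average score), and the case-insensitive name comparison are all assumptions, and the scores in the usage example are made up.

```python
from typing import Mapping


def resolve_winner(top_scores: Mapping[str, float]) -> str:
    """Pick the company with the highest coding average score.

    `top_scores` maps each company name (as listed in the market group)
    to the best coding average of any of its models at check time.
    Ties for first place go to the alphabetically first company name.
    Comparing names case-insensitively is an assumption; the rules do
    not say how names such as "xAI" or "Z.ai" are ordered.
    """
    if not top_scores:
        # Assumed handling: with no leaderboard data there is nothing to resolve.
        raise ValueError("no scores available at check time")

    best = max(top_scores.values())
    # Collect every company whose score equals the top score as published.
    tied = [name for name, score in top_scores.items() if score == best]
    return min(tied, key=str.casefold)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {"OpenAI": 80.0, "Anthropic": 80.0, "Google": 75.0}
    print(resolve_winner(example))
```

With these made-up scores, OpenAI and Anthropic tie for first place, so the alphabetical rule selects Anthropic.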
Market opened: Dec 12, 2025, 1:29 PM ET
Resolver
0x2F5e3684c...
Beware of external links.
Frequently asked questions