OpenAI's GPT-5.4 model commands 100% trader consensus as the best AI for coding on March 31, propelled by state-of-the-art results on key benchmarks such as SWE-bench Verified (80%) and Terminal-Bench 2.0 (75.1%), where specialized Codex variants excel at agentic tasks like multi-step reasoning and real-world software engineering. Released March 5, the model has sustained a narrow lead over Anthropic's Claude Opus 4.6 (80.8% on SWE-bench but trailing on composites) and Google's Gemini 3.1 Pro Preview, reflecting developer acclaim for its code generation and tool integration, along with a tie for the top score (57) on the Artificial Analysis Intelligence Index. While prediction markets aggregate skin-in-the-game wisdom, a last-minute leaderboard update or a surprise rival release could have challenged this positioning, though none materialized today.
Experimental AI-generated summary based on Polymarket data · Updated

OpenAI 100.0%
Google <1%
Z.ai <1%
DeepSeek <1%
$1,351,043 Vol.
Google: No
OpenAI: Yes
Z.ai: No
DeepSeek: No
Mistral: No
Anthropic: No
Alibaba: No
xAI: No
Moonshot: No
If two models are tied for the top LiveBench coding average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “coding average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
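The resolution rules above (take the top "coding average" score, breaking ties alphabetically by company name) can be sketched as a small function. This is an illustrative sketch only; the dictionary of scores is a hypothetical input shape, and the sample numbers are invented for demonstration, not real LiveBench data:

```python
def resolve_market(coding_averages: dict[str, float]) -> str:
    """Pick the winning company from LiveBench 'coding average' scores.

    If two or more companies are tied for the top score at check time,
    the company whose name comes first alphabetically wins, per the
    market's tie-break rule.
    """
    top_score = max(coding_averages.values())
    tied = [name for name, score in coding_averages.items()
            if score == top_score]
    return min(tied)  # alphabetical tie-break


# Hypothetical scores, for illustration only:
scores = {"OpenAI": 80.0, "Anthropic": 80.0, "Google": 78.5}
print(resolve_market(scores))  # tie at 80.0 -> "Anthropic" wins alphabetically
```

Note that the tie-break compares company names as described in the market group, so a renamed or differently styled entry (e.g. "アリババ" vs. "Alibaba") could change the alphabetical ordering.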
Market opened: Dec 12, 2025, 1:29 PM ET
Resolver
0x2F5e3684c... Proposed outcome: Yes
No disputes
Final outcome: Yes