OpenAI's GPT-5.4 Thinking xHigh Effort model tops LiveBench's Mathematics Average leaderboard at 94.15% as of March 28, which underpins its 99.4% market-implied odds: the benchmark's contamination-free math tasks, such as advanced algebra and competition problems, showcase the model's reasoning lead. Recent high-effort optimizations have widened its margin over Google's Gemini 3.1 Pro Preview (91.04%) and Anthropic's Claude 4.6 Opus (89.32%), and no rival releases or evaluations in the past week have altered the standings. Traders' strong consensus reflects real-money bets on this gap persisting through the March 31 snapshot, though a surprise release from a competitor such as DeepSeek or xAI with a superior benchmark score could still upset it in the final days.
Experimental AI-generated summary with Polymarket data · Updated
OpenAI 99.4%
xAI <1%
Google <1%
DeepSeek <1%
$474,193 Vol.
Market-implied odds by company:
OpenAI: 99%
xAI: <1%
Google: <1%
DeepSeek: <1%
Anthropic: <1%
Z.ai: <1%
Mistral: <1%
Alibaba: <1%
Moonshot: <1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
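The resolution logic above (highest Mathematics Average wins, ties broken alphabetically by company name) can be sketched as a small Python function. The function name and the scores below are illustrative, not live leaderboard values:

```python
def resolve(scores):
    """Return the winning company per the market rules:
    the highest Mathematics Average score wins; if two or more
    companies are tied, the name that comes first in alphabetical
    order wins."""
    best = max(scores.values())
    tied = [name for name, score in scores.items() if score == best]
    return sorted(tied)[0]

# Illustrative example: a clear leader resolves to that company,
# while an exact tie falls back to alphabetical order.
print(resolve({"OpenAI": 94.15, "Google": 91.04, "Anthropic": 89.32}))
print(resolve({"Google": 94.0, "Anthropic": 94.0}))
```

In the tie case the second call returns "Anthropic", since "Anthropic" precedes "Google" alphabetically.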
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...