OpenAI's latest large language models, including GPT-5.4 Pro, dominate the LiveBench Mathematics Average leaderboard—the key resolution source—with a weighted score of 98.3%, outpacing Anthropic's Claude Opus 4.6 at 97.3% as of the March 18 update. Recent benchmark runs confirm near-perfect performance on saturated tests like AIME (100%) and MATH-500 (99.4%), reflecting superior chain-of-thought reasoning capabilities that have solidified trader consensus at 99.5% implied probability. This positioning stems from OpenAI's iterative releases widening the gap over competitors amid a quiet week for rival announcements. With resolution imminent on March 31, a surprise model drop from Google DeepMind or xAI topping LiveBench math metrics remains the sole realistic upset scenario, though none are signaled.
Experimental AI-generated summary based on Polymarket data · Updated

OpenAI 99.4%
xAI <1%
Google <1%
DeepSeek <1%
$474,205 Vol.

OpenAI
99%

xAI
<1%

Google
<1%

DeepSeek
<1%

Anthropic
<1%

Z.ai
<1%

Mistral
<1%

Alibaba
<1%

Moonshot
<1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
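The resolution procedure above (highest LiveBench "Mathematics Average" score wins, with ties broken alphabetically by company name as listed in this market group) can be sketched as follows. This is an illustrative interpretation only, not official market code, and the scores shown are example values, not real leaderboard data.

```python
# Hypothetical sketch of this market's resolution rule:
# highest LiveBench "Mathematics Average" wins; if tied, the
# company whose listed name comes first alphabetically wins.

def resolve_market(scores: dict[str, float]) -> str:
    """Return the winning company per the stated tie-break rule."""
    top = max(scores.values())
    # All companies tied at the top score.
    tied = [name for name, s in scores.items() if s == top]
    # Alphabetical tie-break on the listed company name.
    return min(tied)

# Illustrative example scores (not actual leaderboard figures).
example = {"OpenAI": 98.3, "Anthropic": 97.3, "xAI": 96.0}
print(resolve_market(example))  # → OpenAI
```

Note that with an exact tie, e.g. OpenAI and Anthropic both at 98.3, the rule would resolve to Anthropic, since "A" precedes "O" alphabetically.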
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...
Be wary of external links.