OpenAI's o1 reasoning model commands a 93.5% implied probability of having the top math-performing AI by March 31, driven by its unmatched benchmark scores—83% on the challenging MATH dataset and 94.8% on AIME 2024—far surpassing rivals like Anthropic's Claude 3.5 Sonnet (around 70% on MATH) and Google's Gemini 1.5 Pro. Trader consensus reflects OpenAI's rapid iteration edge and lack of confirmed superior releases from competitors, with xAI's Grok-3 still in training and DeepSeek's math-focused models trailing in comprehensive evals. Challenges could arise from surprise launches, such as an early Grok-3 beta or Anthropic's next Claude iteration exceeding o1 on key metrics like GSM8K or GPQA before the deadline.
Experimental AI-generated summary based on Polymarket data · Updated
Which company will have the best AI model for math on March 31?
OpenAI 94%
xAI 1.7%
DeepSeek 1.6%
Google 1.0%
$145,484 Volume

OpenAI
94%

xAI
2%

DeepSeek
2%

Google
1%

Anthropic
1%

Moonshot
1%

Z.ai
<1%

Alibaba
<1%

Mistral
<1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
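The tie-break rule above can be sketched as a short Python snippet. The scores below are illustrative placeholders, not real LiveBench leaderboard data:

```python
# Sketch of the market's resolution rule: the company with the highest
# LiveBench "Mathematics Average" wins; on a tie, the company whose name
# comes first alphabetically wins. Scores here are made up for illustration.

def resolve(scores):
    # Sorting key: score descending (negated), then name ascending,
    # so ties break in alphabetical order of the company name.
    return min(scores, key=lambda company: (-scores[company], company))

scores = {
    "OpenAI": 91.2,  # hypothetical values
    "xAI": 91.2,
    "Google": 88.4,
}

print(resolve(scores))  # OpenAI wins the tie with xAI alphabetically
```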
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...
Do not trust external links.
Frequently Asked Questions