OpenAI's o1 reasoning models dominate math benchmarks like MATH (94.8% accuracy) and AIME 2024 (83.3%), far outpacing rivals such as Anthropic's Claude 3.5 Sonnet (88.8% on MATH) and Google's Gemini 2.0 Flash, driving trader consensus to a 98.1% implied probability for OpenAI on March 31 leaderboards. This lead stems from o1's chain-of-thought reasoning breakthroughs announced in September 2024, with no competing releases matching its capabilities in the past month. Sentiment reflects the wisdom of crowds in prediction markets, where real capital backs the view of sustained superiority absent major announcements. Challenges could arise from surprise drops like a Gemini 2.0 full release or xAI's Grok-3 before resolution, though historical AI launch timelines make this unlikely.
Experimental AI-generated summary based on Polymarket data · Updated
Which company will have the best AI model for math on March 31?
OpenAI 98.2%
Google <1%
Anthropic <1%
DeepSeek <1%
$412,686 Volume

OpenAI 98%
Google 1%
Anthropic 1%
DeepSeek <1%
xAI <1%
Z.ai <1%
Alibaba <1%
Moonshot <1%
Mistral <1%
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
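The resolution rule above (highest Mathematics Average wins; an exact tie goes to the company whose name comes first alphabetically) can be sketched as follows. This is a minimal illustration of the stated rule only; the company names and scores are made up, not real LiveBench data.

```python
# Sketch of the resolution rule: highest Mathematics Average score wins;
# on an exact tie, the alphabetically first company name wins.
# Scores below are illustrative placeholders, not LiveBench results.

def resolve_market(scores: dict[str, float]) -> str:
    """Return the winning company given {company: math_average} scores."""
    best = max(scores.values())
    tied = [name for name, score in scores.items() if score == best]
    return min(tied)  # alphabetical tiebreak

# Example with a hypothetical tie between OpenAI and Google:
print(resolve_market({"OpenAI": 92.1, "Google": 92.1, "Anthropic": 90.4}))
# -> Google  (tied with OpenAI; "Google" sorts first alphabetically)
```

Note that the tiebreak uses the company name "as it is described in this market group", so the string compared should match the market's own labels.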
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver: 0x2F5e3684c...
Do not trust external links.
Frequently asked questions