OpenAI's o1-preview and o1-mini large language models dominate trader consensus at 98.8% implied probability to lead math benchmarks on March 31. The consensus is driven by their state-of-the-art performance on rigorous tests such as AIME 2024 (83.3% accuracy) and the MATH dataset (94.1%), leveraging advanced chain-of-thought reasoning and test-time compute that outpace rivals such as Anthropic's Claude 3.5 Sonnet and DeepSeek's open models. No credible announcements or leaks in the past week indicate competitors closing the gap, and typical AI model training and release cycles span months. Upcoming catalysts include potential Gemini 2.0 updates from Google and xAI's Grok-3, but delays in safety evaluations or compute constraints could preserve OpenAI's edge; a surprise benchmark-topping release from any underdog remains the primary upset risk.
Experimental AI-generated summary based on Polymarket data · Updated
Which company will have the best AI model for math on March 31?
OpenAI 99.1%
Google <1%
DeepSeek <1%
Anthropic <1%
xAI <1%
Moonshot <1%
Z.ai <1%
Mistral <1%
Alibaba <1%
Volume: $437,673
If two models are tied for the highest LiveBench Mathematics Average score at this market's check time, resolution will be based on whichever company's name, as it is described in this market group, comes first in alphabetical order.
The primary source of resolution for this market will be LiveBench’s AI leaderboard, specifically the “Mathematics Average” category, found at livebench.ai. If this resolution source is unavailable at check time, this market will remain open until the LiveBench AI leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
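The resolution rule above (highest LiveBench Mathematics Average, ties broken by alphabetical order of company name) can be sketched in a few lines. This is an illustrative sketch only: the function name `resolve_market`, the data shape, and the example scores are assumptions, not LiveBench's actual data format or API.

```python
def resolve_market(leaderboard):
    """Pick the winning company from (company, mathematics_average) pairs:
    highest score wins; ties resolve to the alphabetically first name."""
    top_score = max(score for _, score in leaderboard)
    tied = [company for company, score in leaderboard if score == top_score]
    return min(tied)  # alphabetical tie-break per the market rules

# Hypothetical scores, including a tie at the top:
scores = [
    ("OpenAI", 94.1),
    ("Google", 94.1),   # tied with OpenAI
    ("DeepSeek", 89.0),
]
print(resolve_market(scores))  # prints "Google" (alphabetically first)
```

Note that a tie in the floating-point Mathematics Average is unlikely in practice; the alphabetical clause exists only as a deterministic fallback.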
Market opened: Dec 12, 2025, 1:25 PM ET
Resolver
0x2F5e3684c...
Do not trust external links.
Frequently asked questions