OpenAI's o1 reasoning models, released in September 2024, have pushed the top of the LMSYS Chatbot Arena leaderboard higher, with top Elo scores surpassing 1350 as users in blind A/B tests prefer their chain-of-thought capabilities over prior leaders like Claude 3.5 Sonnet. This marks a rapid climb from the mid-1200s earlier in the year, driven by advances in AI reasoning that improve performance on complex queries. Competitive pressure from Anthropic, Google DeepMind, and xAI continues to accelerate improvements, and trader consensus reflects expectations of further gains by December 31 amid rumors of next-generation releases such as a potential GPT-5 or Claude 4. Watch for official model launches and benchmark updates, since Arena scores depend on real-user votes that capture live model capabilities.
Experimental AI-generated summary based on Polymarket data · Updated
$65,842 Vol.
↑ 1550: 59%
↑ 1600: 27%
↑ 1650: 13%
↑ 1700: 11%
Results from the 'Score' section on the 'Text Arena' Leaderboard tab (https://lmarena.ai/leaderboard/text), with the style control unchecked, will be used to resolve this market.
The resolution source is the Chatbot Arena LLM Leaderboard (https://lmarena.ai/). If this source is temporarily unavailable, the market remains open until it is accessible again; if permanently unavailable, this market will resolve to "No".
Market opened: Jan 2, 2026, 1:29 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently asked questions