Trader consensus on the LMSYS Chatbot Arena leaderboard (a crowdsourced benchmark that ranks large language models by Elo score from blind user votes) reflects intensifying competition among AI labs, with Anthropic's Claude 3.5 Sonnet leading at around 1286 Elo following its June 2024 release and subsequent fine-tunes. OpenAI's o1 reasoning models and GPT-4o trail closely, buoyed by demonstrated gains on complex tasks, while Meta's open-source Llama 3.1 405B has climbed via community optimizations. Recent catalysts include iterative updates and leaks hinting at year-end contenders such as xAI's Grok-3 and Google's Gemini 2.0, which could push top scores past 1300 Elo by December 31 amid escalating capabilities in coding, math, and multimodal reasoning. Watch OpenAI's DevDay announcements and Anthropic safety evals for momentum shifts.
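For readers unfamiliar with how blind votes become scores: the classic online Elo update is a minimal sketch of the idea (the leaderboard's actual aggregation method and any K-factor are assumptions here, not its published implementation):

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """One Elo update after a head-to-head blind vote.

    score_a is 1.0 if model A wins the comparison, 0.0 if it
    loses, and 0.5 for a tie. k (assumed 32) controls how far
    one vote moves the ratings.
    """
    # Expected score for A given the current rating gap.
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    expected_b = 1.0 - expected_a
    # Move each rating toward the observed result.
    new_a = r_a + k * (score_a - expected_a)
    new_b = r_b + k * ((1.0 - score_a) - expected_b)
    return new_a, new_b

# Two evenly rated models; A wins the blind vote.
print(elo_update(1200, 1200, 1.0))  # (1216.0, 1184.0)
```

Note that a single update conserves the total rating; large leaderboard gaps therefore reflect many consistent votes, not a few lucky ones.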
Experimental AI-generated summary based on Polymarket data · Updated
Chatbot Arena: How high will AI score by December 31?
$65,862 Volume
↑ 1550: 59%
↑ 1600: 28%
↑ 1650: 13%
↑ 1700: 11%
Results from the 'Score' section on the 'Text Arena' Leaderboard tab (https://lmarena.ai/leaderboard/text), with the style control unchecked, will be used to resolve this market.
The resolution source is the Chatbot Arena LLM Leaderboard (https://lmarena.ai/). If this source is temporarily unavailable, the market remains open until it is accessible again; if permanently unavailable, this market will resolve to "No".
Market opened: Jan 2, 2026, 1:29 PM ET
Resolver
0x65070BE91...
Do not trust external links.
Frequently asked questions