Anthropic's Claude Opus 4.6 Thinking model recently surged to the top of Arena.ai's Text leaderboard with a 1504 Elo score, surpassing Google's Gemini 3.1 Pro Preview (1493) and xAI's new Grok 4.20 Beta (1491 preliminary). The ranking is based on millions of crowdsourced blind battles that evaluate large language models on reasoning, creative writing, and expert prompts. The result reflects rapid iterative progress from frontier AI labs since early 2026, including OpenAI's GPT-5.4 High entering the top 10 in March, as competition pushes benchmark scores higher. Traders should monitor Q2 announcements such as potential GPT-5.5 or Gemini 4.0 releases, regulatory scrutiny of AI safety, and the risk of leaderboard saturation, since the top score could climb toward 1550 or higher by the year-end resolution on the Overall category with style control unchecked.
Experimental AI summary based on Polymarket data · Updated
$65,865 volume
↑ 1550: 59%
↑ 1600: 27%
↑ 1650: 13%
↑ 1700: 11%
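One way to read the four prices together is to convert them into implied probabilities for each score band. The sketch below is illustrative only: it assumes each "↑ X" line is a separate "top score above X" market (so the outcomes are nested), treats the quoted percentage as a probability, and ignores fees and bid-ask spreads. The function name `band_probabilities` is hypothetical and not part of any Polymarket tooling.

```python
# Quoted "Yes" prices for each threshold, taken from the listing above.
# Assumption: each line is a separate "top score above X" market, so the
# events are nested (a score above 1600 is also above 1550).
PRICES = {1550: 0.59, 1600: 0.27, 1650: 0.13, 1700: 0.11}


def band_probabilities(prices):
    """Turn nested threshold prices into implied probabilities per score band."""
    thresholds = sorted(prices)
    bands = {}
    # Probability mass below the lowest threshold.
    bands[f"below {thresholds[0]}"] = 1.0 - prices[thresholds[0]]
    # Probability mass between consecutive thresholds.
    for lo, hi in zip(thresholds, thresholds[1:]):
        bands[f"{lo} to {hi}"] = prices[lo] - prices[hi]
    # Probability mass above the highest threshold.
    bands[f"above {thresholds[-1]}"] = prices[thresholds[-1]]
    return bands


if __name__ == "__main__":
    for band, p in band_probabilities(PRICES).items():
        print(f"{band}: {p:.0%}")
```

Under these assumptions the prices imply roughly 41% for a top score below 1550, 32% for 1550 to 1600, 14% for 1600 to 1650, 2% for 1650 to 1700, and 11% above 1700. A negative band would signal that the nested prices are inconsistent with each other.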
Results from the 'Score' section on the 'Text Arena' Leaderboard tab (https://lmarena.ai/leaderboard/text), with the style control unchecked, will be used to resolve this market.
The resolution source is the Chatbot Arena LLM Leaderboard (https://lmarena.ai/). If this source is temporarily unavailable, the market remains open until it is accessible again; if permanently unavailable, this market will resolve to "No".
Market opened: Jan 2, 2026, 1:29 PM ET
Resolver
0x65070BE91... Proposed outcome: Yes
No dispute
Final outcome: Yes
Frequently asked questions