Anthropic's Claude Opus 4.6 Thinking model recently surged to the top of Arena.ai's Text leaderboard with a 1504 Elo score, surpassing Google's Gemini 3.1 Pro Preview (1493) and xAI's new Grok 4.20 Beta (1491 preliminary), based on millions of crowdsourced blind battles evaluating large language model capabilities in reasoning, creative writing, and expert prompts. This reflects rapid iterative progress from frontier AI labs since early 2026, including OpenAI's GPT-5.4 High entering the top 10 in March, amid competitive dynamics pressuring higher benchmarks. Traders should monitor Q2 announcements like potential GPT-5.5 or Gemini 4.0 releases, regulatory scrutiny on AI safety, and risks of leaderboard saturation, as scores could climb toward 1550+ by year-end resolution on the style-unchecked Overall category.
Résumé expérimental généré par IA à partir des données Polymarket · Mis à jour$65,865 Vol.
↑ 1550
59%
↑ 1600
27%
↑ 1650
13%
↑ 1700
11%
$65,865 Vol.
↑ 1550
59%
↑ 1600
27%
↑ 1650
13%
↑ 1700
11%
Results from the 'Score' section on the 'Text Arena' Leaderboard tab (https://lmarena.ai/leaderboard/text), with the style control unchecked, will be used to resolve this market.
The resolution source is the Chatbot Arena LLM Leaderboard (https://lmarena.ai/). If this source is temporarily unavailable, the market remains open until it is accessible again; if permanently unavailable, this market will resolve to "No".
Marché ouvert : Jan 2, 2026, 1:29 PM ET
Resolver
0x65070BE91...Résultat proposé: Oui
Aucune contestation
Résultat final: Oui
Results from the 'Score' section on the 'Text Arena' Leaderboard tab (https://lmarena.ai/leaderboard/text), with the style control unchecked, will be used to resolve this market.
The resolution source is the Chatbot Arena LLM Leaderboard (https://lmarena.ai/). If this source is temporarily unavailable, the market remains open until it is accessible again; if permanently unavailable, this market will resolve to "No".
Resolver
0x65070BE91...Résultat proposé: Oui
Aucune contestation
Résultat final: Oui
Anthropic's Claude Opus 4.6 Thinking model recently surged to the top of Arena.ai's Text leaderboard with a 1504 Elo score, surpassing Google's Gemini 3.1 Pro Preview (1493) and xAI's new Grok 4.20 Beta (1491 preliminary), based on millions of crowdsourced blind battles evaluating large language model capabilities in reasoning, creative writing, and expert prompts. This reflects rapid iterative progress from frontier AI labs since early 2026, including OpenAI's GPT-5.4 High entering the top 10 in March, amid competitive dynamics pressuring higher benchmarks. Traders should monitor Q2 announcements like potential GPT-5.5 or Gemini 4.0 releases, regulatory scrutiny on AI safety, and risks of leaderboard saturation, as scores could climb toward 1550+ by year-end resolution on the style-unchecked Overall category.
Résumé expérimental généré par IA à partir des données Polymarket · Mis à jour
Méfiez-vous des liens externes.
Méfiez-vous des liens externes.
Questions fréquentes