Trader consensus on Polymarket assigns a 97.5% implied probability to claude-opus-4-6-thinking finishing as the top AI model on the LMSYS Chatbot Arena leaderboard (Style Control Off) as of April 10. The pricing is driven by Anthropic's early-February 2026 release of Claude Opus 4.6, which rapidly claimed the #1 spot across the text, coding, and reasoning categories on superior Elo scores and structured reasoning hierarchies. Recent developer reports highlight its edge over GPT-5.4 on benchmarks such as SWE-Bench Verified (80.8%) and its 1M-token context for complex tasks, solidifying its lead despite thin margins among frontier large language models. With the leaderboard snapshot just five days away, challengers such as Gemini 3 Pro or Grok 4.20 beta1 would need an unprecedented surge in user battle wins, which is unlikely without a surprise model preview.
Experimental AI-generated summary from Polymarket data · Updated

Market odds ($15,113 Vol.):
- claude-opus-4-6-thinking — 97.8%
- claude-opus-4-6 — 1%
- qwen3.5-max-preview — <1%
- gemini-3-pro — <1%
- kimi-k2.5-thinking — <1%
- grok-4.20-beta1 — <1%
- dola-seed-2.0-preview — <1%
- gpt-5.4-high — <1%
- gemini-3.1-pro-preview — <1%
- gemini-3-flash — <1%
- gemini-2.5-pro — <1%
Results from the "Score" column under the "Text Arena | Overall" Leaderboard tab at https://lmarena.ai/leaderboard/text with style control off will be used to resolve this market.
Models will be ranked primarily by their arena score at this market’s check time, with alphabetical order of model names as listed in this market group (full string, including suffixes such as “-thinking”) used as a tiebreaker (e.g., if the two models are tied by arena score, “claude-opus-4-6” would be ranked ahead of “claude-opus-4-6-thinking”). This market will resolve based on the model that occupies first place under this ranking.
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
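The ranking rule above (arena score first, alphabetical full model name as the tiebreaker) can be sketched in a few lines of Python. This is an illustrative reconstruction, not official resolver code; the model names and scores in the example are hypothetical.

```python
# Sketch of the market's stated ranking rule: order models by arena score
# (descending), breaking ties by the full model-name string (ascending
# alphabetical). Scores below are made up for illustration.

def rank_models(scores: dict[str, float]) -> list[str]:
    """Return model names ordered per the market's resolution rule."""
    # Sort key (-score, name): higher score wins; on a tie, the
    # alphabetically earlier name ranks first.
    return sorted(scores, key=lambda name: (-scores[name], name))

# Example: a tie between the two Claude entries resolves alphabetically,
# so "claude-opus-4-6" ranks ahead of "claude-opus-4-6-thinking".
example = {
    "claude-opus-4-6-thinking": 1450.0,
    "claude-opus-4-6": 1450.0,
    "gemini-3-pro": 1442.0,
}
print(rank_models(example)[0])  # → claude-opus-4-6
```

Under this rule the market would resolve to whichever model `rank_models` places first at check time.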
Market opened: Apr 2, 2026, 5:57 PM ET
Resolver
0x69c47De9D...