
Which company has the #2 AI model end of February? (Style Control On)
Volume: $136.71K
This market will resolve according to the company that owns the model with the second-highest arena score based on the Chatbot Arena LLM Leaderboard when the table under the "Leaderboard" tab is checked on February 28, 2026, 12:00 PM ET. Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/leaderboard/text set to default (style control on) will be used to resolve this market. Models will be ranked primarily by their arena score at this market's check time, with ties broken by the leaderboard's built-in tie-breaking mechanisms.
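Mechanically, the resolution reduces to sorting a leaderboard snapshot by score and reading off the owner of the second entry. A minimal sketch in Python, using a hypothetical snapshot with invented scores and vote counts (the vote-count tie-break mirrors the tie-break note further down this page; the actual snapshot may differ):

```python
def resolve_second_place(leaderboard):
    """Return the company owning the model with the second-highest
    Arena Score, breaking ties by vote count (one tie-breaker the
    leaderboard has historically used)."""
    ranked = sorted(
        leaderboard,
        key=lambda m: (m["arena_score"], m["votes"]),
        reverse=True,
    )
    return ranked[1]["company"]

snapshot = [  # illustrative numbers only, not the real leaderboard
    {"model": "model-a", "company": "OpenAI",    "arena_score": 1365, "votes": 52000},
    {"model": "model-b", "company": "Anthropic", "arena_score": 1340, "votes": 48000},
    {"model": "model-c", "company": "Google",    "arena_score": 1338, "votes": 61000},
]
print(resolve_second_place(snapshot))  # -> "Anthropic"
```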
Prediction markets currently give Anthropic an 81% chance of having the second-ranked AI model by the end of February. This means traders see it as very likely, roughly a 4 in 5 probability. The market is focused on the "Chatbot Arena" leaderboard, a popular public ranking where AI models are tested and scored by users. Being ranked #2 is a public signal of being a top competitor, just behind the leader.
The high confidence in Anthropic stems from its consistent performance. Its Claude models have been near the top of public leaderboards for over a year, often trading the #2 spot with Google's Gemini. Recently, Anthropic released Claude 3.5 Sonnet, which received strong reviews for its reasoning and coding ability, likely solidifying its competitive position.
A key factor is the perceived gap between the top companies and the rest. OpenAI's models, especially GPT-4 and its successors, have consistently held the #1 rank. The real contest has been for second place between Anthropic and Google. Markets are betting that Anthropic's focused strategy on AI safety and capability has given it a slight, stable edge over Google's efforts for this specific ranking period.
The resolution date is fixed for February 28, 2026, when the Arena leaderboard will be checked. The ranking is based on a rolling evaluation, so a last-minute model release could theoretically change the outcome. However, developing and deploying a top-tier model takes months, not days. The more relevant period to watch is the next month or two, as companies like Google or Meta could announce new models that begin climbing the leaderboard well before the February deadline.
Prediction markets are generally good at aggregating information about publicly trackable outcomes like leaderboard rankings. The Chatbot Arena score is a clear, objective metric. For similar "horse race" questions in technology, markets have a decent track record. The main limitation here is time: the resolution is nearly two years away, which is an eternity in AI. A lot of technical progress and strategic shifts can happen, making today's high probability an informed guess about the current trajectory, not a guaranteed future outcome.
Prediction markets assign an 81% probability that Anthropic will own the second-ranked AI model by the end of February 2026. This price, trading at 81¢ on Polymarket, indicates traders view the outcome as highly probable. The market resolves based on the public Chatbot Arena leaderboard, which ranks models by an "Arena Score" derived from anonymous user votes. With $137,000 in total volume, the market carries modest but real liquidity, suggesting informed participants are backing their views with capital.
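For readers unfamiliar with how prices map to probabilities: a YES share pays $1.00 if the outcome occurs, so the 81¢ price is the market's implied probability, and a trader's per-share edge is simply their own belief minus the price (fees and slippage ignored in this sketch):

```python
# Per-share economics of a binary market share that pays $1.00 on YES.
# The 81-cent price is from the market above; the belief p is whatever
# probability a trader assigns.
PRICE = 0.81

def expected_profit(p, price=PRICE):
    """Expected profit of buying one YES share at `price` given belief p."""
    return p * 1.00 - price

print(expected_profit(0.90))  # +$0.09 per share if you believe 90%
print(expected_profit(0.75))  # -$0.06 per share if you believe 75%
```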
The high confidence in Anthropic stems directly from the current competitive landscape. As of February 2026, Anthropic's Claude 3.5 Sonnet model consistently holds the #2 position on the Arena leaderboard, just behind OpenAI's GPT-4. This ranking has been stable for months, demonstrating resilience against new releases from competitors like Google's Gemini or Meta's Llama. The market is effectively pricing in continuity, betting that no competitor will execute a significant enough leap in the final week to dislodge Claude from its established spot. The specific "style control on" filter for the market also advantages Anthropic, as Claude models are often praised for their consistent, controlled outputs in comparative testing.
A last-minute shift is unlikely but possible. The primary risk is an unannounced model release or a major leaderboard update in the final days before the February 28th resolution. A competitor like Google DeepMind could push a new Gemini iteration, or a well-funded open-source project could surge in user ratings. However, the short timeframe works against such volatility. Leaderboard scores are based on aggregated user evaluations, which require time for sufficient voting volume to alter the mean score meaningfully. A new model would need not only to launch but to immediately achieve superior performance ratings from thousands of users within days, a historically rare event. The market's 81% price reflects this low probability of a last-minute upset.
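To see why vote volume matters, note that under the logistic model behind Elo-style ratings, a head-to-head win rate p maps to a rating gap of 400·log10(p/(1−p)). Arena Scores are actually fit with a Bradley-Terry model over all votes, but a simple binomial approximation (an assumption for illustration, as is the 55% example) shows how wide the uncertainty remains at low vote counts:

```python
import math

def rating_gap(win_rate):
    """Elo-style rating gap implied by a head-to-head win rate:
    p = 1 / (1 + 10**(-d/400))  =>  d = 400 * log10(p / (1 - p))."""
    return 400 * math.log10(win_rate / (1 - win_rate))

def gap_interval(win_rate, n, z=1.96):
    """Rough 95% interval on the rating gap from n votes at the observed
    win rate (normal approximation to the binomial)."""
    se = math.sqrt(win_rate * (1 - win_rate) / n)
    return rating_gap(win_rate - z * se), rating_gap(win_rate + z * se)

# A new model winning 55% of its battles sits ~35 points above its
# opposition, but with only 500 votes the interval is still roughly
# +4 to +66 points; it takes thousands of votes to pin the gap down.
print(round(rating_gap(0.55)))                             # ~35
print(tuple(round(x) for x in gap_interval(0.55, 500)))    # wide
print(tuple(round(x) for x in gap_interval(0.55, 5000)))   # much tighter
```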
AI-generated analysis based on market data. Not financial advice.
This prediction market focuses on identifying which company will possess the second-ranked artificial intelligence model according to the Chatbot Arena LLM Leaderboard at the end of February 2026. The resolution is based on the 'Arena Score' from the leaderboard's 'Style Control On' setting, which ranks models based on anonymous, crowdsourced human evaluations of their conversational abilities. The market specifically tracks the company behind the model with the second-highest score, not the model's name itself. This creates a direct proxy for measuring competitive positioning in the rapidly evolving field of large language models.

The Chatbot Arena, operated by the Large Model Systems Organization (LMSYS Org), has become a widely cited benchmark since its launch in May 2023. Its ranking system is valued because it relies on human preference votes from thousands of users in blind, randomized 'battles' between models, providing a practical assessment of performance as experienced by end-users.

Interest in this market stems from the intense commercial and technological race among tech giants and well-funded startups to develop superior AI. The identity of the runner-up company offers insights into market dynamics, investment trends, and which organizations are successfully translating research into competitive products. Tracking this position over time reveals shifts in the competitive landscape that might not be apparent from simply watching the top spot.
The competitive benchmarking of AI models through human evaluation began gaining structure with the launch of the Chatbot Arena by LMSYS Org in May 2023. Prior to this, model comparisons relied heavily on academic datasets like MMLU or HellaSwag, which measure specific knowledge or reasoning tasks but not conversational fluency. The Arena introduced a live, crowdsourced system where users vote on which of two anonymized models gives a better response to their prompt. This method quickly became popular because it reflected real-world usability. In the Arena's early months, OpenAI's GPT-4 dominated the top position. The competition for second place was volatile, shifting between models from Anthropic (Claude), Google (then via Bard/PaLM), and open-source offerings. A significant shift occurred in late 2023 and early 2024 with the release of Claude 3 and Gemini Ultra, which challenged GPT-4's dominance and reshuffled the rankings. By mid-2024, the leaderboard regularly showed a tight cluster of top models separated by small score margins, making the ranking order, especially for second place, highly sensitive to incremental model updates. The historical precedent shows that no company except OpenAI has maintained the top spot for a full year, but several have held the second position for months at a time, indicating a tier of close competitors.
The ranking of AI models has substantial economic consequences. Companies with highly-ranked models attract more developer interest, secure better partnership deals, and can command premium pricing for API access. For instance, a model consistently ranking in the top two can justify enterprise contracts worth hundreds of millions of dollars. The race also influences talent acquisition, as top AI researchers are drawn to organizations demonstrating leading-edge capabilities. Beyond commerce, the competitive landscape shapes the technological tools available to the public and businesses. The model in second place often becomes the primary alternative or benchmark for the leader, driving innovation through competition. Its capabilities set the standard for what is considered commercially viable state-of-the-art AI in applications ranging from coding assistants to customer service agents. The identity of the runner-up company signals where investors, policymakers, and industry observers should focus attention, as it may indicate an emerging challenger or a stalwart maintaining relevance.
As of late 2024, the Chatbot Arena leaderboard under 'Style Control On' is typically led by a model from OpenAI, often GPT-4o or a more recent iteration. The second position has been contested by Anthropic's Claude 3.5 Sonnet and Google's Gemini models. xAI's Grok-2 and Meta's Llama 3.1 405B have also appeared in the top five. The scores are dynamic, changing weekly as new votes are cast and new model versions are submitted for evaluation. The most recent developments involve companies releasing more frequent, incremental updates to their models, each aiming to gain a few Elo points and climb the rankings.
The Chatbot Arena LLM Leaderboard is a public benchmark created by LMSYS Org that ranks large language models based on anonymous, crowdsourced human evaluations. Users engage in blind conversations with two models and vote for the better response. These votes are used to calculate an Elo-style rating called the Arena Score for each model.
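The sketch below shows the general technique for turning pairwise votes into an Elo-like score, using the classic minorization-maximization update for the Bradley-Terry model. The vote counts and the 1000-point offset are invented for illustration, and LMArena's production pipeline adds machinery (ties, confidence intervals, style adjustment) not shown here:

```python
import math

def bradley_terry(wins, iters=200):
    """Fit Bradley-Terry strengths from wins[i][j] = number of votes
    where model i beat model j, via the standard MM update."""
    n = len(wins)
    strength = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            total_wins = sum(wins[i])
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                for j in range(n) if j != i
            )
            new.append(total_wins / denom if denom else strength[i])
        mean = sum(new) / n          # fix the overall scale each pass
        strength = [s / mean for s in new]
    # Convert multiplicative strengths to an Elo-like additive scale.
    return [400 * math.log10(s) + 1000 for s in strength]

votes = [        # illustrative vote counts between three models
    [0, 60, 70],
    [40, 0, 55],
    [30, 45, 0],
]
print([round(r) for r in bradley_terry(votes)])
```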
'Style Control On' is a setting on the Chatbot Arena leaderboard that statistically adjusts Arena Scores to control for stylistic features of responses, such as their length and markdown formatting, rather than changing how voters cast ballots. The adjustment aims to produce rankings that reflect a model's substantive quality independent of presentation effects; it is the default and most commonly referenced view of the leaderboard.
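One way to implement the idea, sketched under loose assumptions (a single verbosity covariate, synthetic votes, sklearn's default lightly regularized logistic fit; not LMArena's exact feature set or pipeline): refit the pairwise-vote logistic model with a style covariate so the per-model coefficients estimate quality net of style.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_models, n_votes = 4, 2000
true_quality = np.array([0.0, 0.4, 0.8, 1.2])   # latent quality (assumed)
verbosity = np.array([0.0, 1.0, 0.0, 1.0])      # some models are wordier
length_bias = 0.5                               # voters mildly favor length

a = rng.integers(0, n_models, n_votes)
b = rng.integers(0, n_models, n_votes)
keep = a != b
a, b = a[keep], b[keep]

# Simulate votes: quality difference plus a style (length) effect.
logit = (true_quality[a] - true_quality[b]) + length_bias * (verbosity[a] - verbosity[b])
wins = rng.random(len(a)) < 1 / (1 + np.exp(-logit))  # True = model a wins

# Design matrix: +1/-1 indicator per model, plus the style covariate.
# With the covariate included, the model coefficients estimate quality
# with the length effect partialled out.
X = np.zeros((len(a), n_models + 1))
X[np.arange(len(a)), a] += 1.0
X[np.arange(len(a)), b] -= 1.0
X[:, -1] = verbosity[a] - verbosity[b]

fit = LogisticRegression(fit_intercept=False).fit(X, wins)
print(np.round(fit.coef_[0][:n_models], 2))  # style-controlled ratings
print(round(fit.coef_[0][-1], 2))            # recovered length bias (~0.5)
```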
The market resolves to the company because specific model names and versions change frequently, sometimes monthly. A company like OpenAI may have multiple models (GPT-4, GPT-4o, o1) on the leaderboard simultaneously. The company identifier provides a more stable resolution target that captures which organization is fielding a top-tier model, reflecting competitive standing rather than a transient product name.
The leaderboard is updated continuously as new votes are processed, but the public table typically refreshes every few days. The scores are dynamic, so the ranking at any given hour can differ slightly. The market uses a specific snapshot taken on February 28, 2026, at 12:00 PM ET for resolution.
The market rules state models are ranked primarily by their Arena Score. In the event of a tie, the leaderboard's built-in tie-breaking mechanisms would be used. Historically, LMSYS Org has used additional statistical metrics or the number of votes to separate tied models, ensuring a clear ordinal ranking for the public list.
Educational content is AI-generated and sourced from Wikipedia. It should not be considered financial advice.
11 markets tracked

| Market | Platform | Price |
|---|---|---|
| Anthropic | Poly | 81% |
|  | Poly | 5% |
|  | Poly | 2% |
|  | Poly | 2% |
|  | Poly | 1% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |





Add this market to your website
<iframe src="https://predictpedia.com/embed/yUmdWE" width="400" height="160" frameborder="0" style="border-radius: 8px; max-width: 100%;" title="Which company has the #2 AI model end of February? (Style Control On)"></iframe>