Which company has the #1 AI model end of February? (Style Control On)
This market will resolve according to the company that owns the model with the highest arena score based on the Chatbot Arena LLM Leaderboard when the table under the "Leaderboard" tab is checked on February 28, 2026, 12:00 PM ET. Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/leaderboard/text set to default (style control on) will be used to resolve this market. If two models are tied for the best arena score at this market's check time, resolution will be based on the highest "Arena Score" from the previous day's data.
Prediction markets currently give Anthropic an 85% chance of having the top-ranked AI model by the end of February. This means traders collectively see it as very likely, roughly a 6 in 7 chance, that an Anthropic model like Claude will be number one on a specific public benchmark in one week.
The benchmark is the "Chatbot Arena" leaderboard, a crowdsourced website where thousands of users vote on which AI chatbot gives better answers in blind tests. The top spot is a visible, if imperfect, signal of current user preference and model capability.
Two main factors are driving the high confidence in Anthropic. First, the company's Claude 3.5 Sonnet model has held the number one position on this leaderboard for most of the last six months. It established a strong lead over competitors like OpenAI's GPT-4 and various open-source models.
Second, no major model release from a key competitor is publicly expected before the February 28 deadline. OpenAI sometimes ships updates without warning, but markets are betting that no surprise in the next seven days will be enough to dethrone Claude. The prediction reflects the stability of the competitive landscape over recent months more than a bold forecast of a new breakthrough.
The only firm date is the resolution date: February 28, 2026, at 12:00 PM ET, when the market will check the leaderboard. Any shift in the odds before then would likely come from an unexpected announcement or leak about a new model release, most probably from OpenAI or Google. Traders will be watching for any official blog posts or credible industry rumors suggesting a new model is being launched or evaluated before the cutoff.
For near-term, fact-based questions like this one, prediction markets tend to be fairly accurate. They are good at aggregating known information, like a model's existing leaderboard position. The major limitation here is the potential for a surprise: if a competitor quietly submits a new model to the Arena for testing and it quickly climbs the ranks, the market could be caught off guard. The 85% probability isn't a guarantee; it's the market's odds that no such surprise arrives in the next week.
Prediction markets currently assign an 85% probability that Anthropic will have the top-ranked AI model on the Chatbot Arena leaderboard on February 28, 2026. This price, trading at 85¢ on Polymarket, indicates strong confidence in Anthropic's near-term lead. With roughly $586,000 in total volume across related markets, liquidity is sufficient for the consensus to be meaningful. An 85% price is a clear signal: the market treats an Anthropic victory as the expected outcome, though not a certainty.
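To make that pricing concrete, here is a minimal Python sketch of the arithmetic, assuming a standard binary contract that pays $1.00 on YES (fees and slippage ignored):

```python
# Implied probability and payoff math for a binary prediction-market share.
# Assumes a Polymarket-style contract: pays $1.00 if YES, $0.00 otherwise.

def implied_probability(price_cents: float) -> float:
    """A share trading at 85c implies roughly an 85% market-assigned probability."""
    return price_cents / 100.0

def expected_profit_per_share(price_cents: float, your_probability: float) -> float:
    """Expected profit of buying one YES share at the quoted price,
    given your own probability estimate (fees and slippage ignored)."""
    cost = price_cents / 100.0
    return your_probability * 1.00 - cost

price = 85.0                       # quoted YES price in cents
print(implied_probability(price))  # 0.85 -> about 6-in-7 (odds of 0.85/0.15 ~ 5.7:1)
print(expected_profit_per_share(price, your_probability=0.90))  # +$0.05 if you believe 90%
```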
The high probability directly reflects Anthropic's current dominance. As of February 2026, Anthropic's Claude 3.5 Sonnet holds the #1 position on the Arena leaderboard. The market is effectively pricing in the stability of this lead over the next week. Historical data shows that once a model secures the top spot, it typically maintains it for a period unless a competitor releases a major upgrade. No such release from OpenAI, Google, or other contenders is scheduled before the February 28 checkpoint. The market is also betting on Anthropic's consistent execution. Their recent model releases have focused on incremental improvements in reasoning and safety, a strategy that appears to solidify their competitive position without introducing destabilizing changes.
The primary risk to the current pricing is an unexpected technical adjustment or leaderboard update from the LMSYS Arena before the resolution time. While no competitor announcements are imminent, a sudden change in the Arena's evaluation methodology could theoretically alter scores. A less likely but possible scenario is the discovery of a critical performance flaw in the leading Claude model that LMSYS validates before February 28. Given the short 7-day timeframe, however, the window for a disruptive event is narrow. The market's high confidence is a bet that the status quo will hold for one more week.
The Chatbot Arena LLM Leaderboard, run by LMSYS, is a key public benchmark for large language models. It uses a crowdsourced, blind-testing "Arena" format where users vote on model outputs, generating an Elo-based ranking. The "Style Control On" setting, which this market uses for resolution, statistically adjusts ratings to control for stylistic factors such as response length and formatting, so scores better reflect the substance of answers. This leaderboard has significant influence in the AI community, often driving developer adoption and public perception. Anthropic's current lead here reinforces its reputation for building top-tier, usable AI systems.
AI-generated analysis based on market data. Not financial advice.
This prediction market focuses on determining which company will possess the top-ranked artificial intelligence language model at the end of February 2026. The resolution is based on a specific, publicly available benchmark: the Chatbot Arena LLM Leaderboard maintained at lmarena.ai. The market will check the 'Arena Score' for models with 'Style Control On' at 12:00 PM Eastern Time on February 28, 2026. The company whose model has the highest score at that precise moment will be declared the winner. In the event of a tie, the market will resolve to the model with the highest 'Arena Score' from the previous day's data.

The Chatbot Arena leaderboard is a widely recognized benchmark in the AI community. It uses a crowdsourced, blind-testing methodology where users vote on the quality of responses from anonymized models. This approach is valued for its real-world, human-evaluated assessment of model performance, contrasting with automated benchmarks that can be more easily optimized.

Interest in this market stems from the intense competition and rapid pace of innovation in the generative AI sector. The identity of the leading model has significant implications for corporate prestige, developer adoption, and investor sentiment. Tracking this specific point-in-time snapshot provides a clear, objective measure of a fleeting technological lead in a highly dynamic field.
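The resolution mechanics above reduce to a simple rule: the highest Arena Score at the checkpoint wins, with the previous day's scores breaking a tie. A minimal sketch of that logic in Python, with invented scores purely for illustration:

```python
# Sketch of the resolution rule: highest Arena Score at the checkpoint wins;
# on a tie, fall back to the previous day's scores. All numbers are made up.

def resolve(scores_at_checkpoint: dict, scores_previous_day: dict) -> str:
    best = max(scores_at_checkpoint.values())
    leaders = [c for c, s in scores_at_checkpoint.items() if s == best]
    if len(leaders) == 1:
        return leaders[0]
    # Tie: decide among the tied companies using the previous day's scores.
    return max(leaders, key=lambda c: scores_previous_day[c])

checkpoint = {"Anthropic": 1362, "OpenAI": 1362, "Google": 1355}  # hypothetical
previous   = {"Anthropic": 1360, "OpenAI": 1358, "Google": 1356}  # hypothetical
print(resolve(checkpoint, previous))  # -> "Anthropic" via the tie-break
```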
The competitive benchmarking of AI models gained prominence with the release of OpenAI's GPT-3 in 2020, which demonstrated unprecedented language capabilities. However, the lack of standardized, human-centric evaluation made direct comparisons difficult. In response, the LMSYS Chatbot Arena was launched in May 2023. It introduced a novel 'battle' format where users converse with two anonymized models and vote for the better response, generating an Elo-style ranking. This method quickly became a community standard. The leaderboard has seen several shifts in leadership. OpenAI's GPT-4 dominated the top position from its release in March 2023 for over a year. In early 2024, Anthropic's Claude 3 Opus and Google's Gemini Ultra began to challenge this dominance, sometimes tying or briefly surpassing GPT-4 in the Arena scores. The release of Meta's Llama 3 in April 2024 also marked a significant event, as a powerful open-source model entered the top tier of rankings. These historical fluctuations demonstrate the rapid progress in the field and set the stage for the ongoing competition measured by this 2026 prediction market.
The ranking of AI models has tangible economic and strategic consequences. The company with the top model gains a significant marketing advantage, attracting enterprise customers, developer mindshare, and potentially influencing its stock valuation. For businesses integrating AI, the leading model often becomes the default choice for critical applications, creating a network effect and lock-in for the provider. This competition drives massive investment in computing infrastructure and research, shaping the broader technology landscape. Beyond corporate rivalry, the performance of these models directly impacts how millions of people interact with AI. The leading model sets expectations for capability, safety, and usefulness in applications from education to customer service. The outcome influences which ethical frameworks and safety practices become industry standards, as different companies prioritize these aspects differently. The result of this benchmark could steer the direction of both commercial AI development and public policy discussions around artificial intelligence.
As of late 2024, the Chatbot Arena leaderboard shows a tightly contested top tier. Anthropic's Claude 3 Opus, OpenAI's GPT-4 series, and Google's Gemini models frequently trade positions within a narrow margin of a few Elo points. The introduction of 'Style Control On' as a filter has added a new dimension to scoring, emphasizing consistent tone and instruction following. All major companies have publicly signaled ongoing development of next-generation models, with industry analysts expecting releases from OpenAI (GPT-5), Google (Gemini successors), and others in the lead-up to the 2026 market resolution date. The competitive landscape remains in flux, with no single company holding an unassailable lead.
What is the Chatbot Arena?
The Chatbot Arena is a public leaderboard run by LMSYS that ranks AI language models based on anonymous, side-by-side human evaluations. Users chat with two hidden models and vote for the better response, generating an Elo rating for each model that reflects its perceived performance in real conversations.
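As a rough illustration of how pairwise votes become ratings, here is a minimal Elo update in Python. The K-factor and starting ratings are invented, and the real leaderboard fits ratings with a more sophisticated statistical model (Bradley-Terry-style) rather than this simple online update:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one blind head-to-head vote."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    return (rating_a + k * (s_a - e_a),
            rating_b + k * ((1.0 - s_a) - (1.0 - e_a)))

# One vote where a 1300-rated model beats a 1350-rated one:
print(elo_update(1300.0, 1350.0, a_won=True))  # underdog gains ~18 pts, favorite loses ~18
```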
What does 'Style Control On' mean?
'Style Control On' is a setting on the Chatbot Arena leaderboard. Rather than changing how models are tested, it applies a statistical adjustment to the rankings that controls for stylistic factors such as response length and formatting, so scores reflect the substance of a model's answers more than their presentation.
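One way to picture the adjustment: model each vote as depending on a capability gap plus a style gap, then report rankings with the style term zeroed out. A toy numeric sketch with invented coefficients (the real leaderboard estimates these from vote data):

```python
import math

def win_prob(skill_a: float, skill_b: float, style_gap: float, style_coef: float) -> float:
    """P(A wins a vote) from a capability gap plus a stylistic bias term
    (e.g., longer or better-formatted answers attracting extra votes)."""
    return 1.0 / (1.0 + math.exp(-((skill_a - skill_b) + style_coef * style_gap)))

# Model A is slightly weaker but writes longer, prettier answers.
raw        = win_prob(skill_a=0.0, skill_b=0.2, style_gap=1.0, style_coef=0.5)
controlled = win_prob(skill_a=0.0, skill_b=0.2, style_gap=1.0, style_coef=0.0)
print(f"raw vote share: {raw:.2f}, style-controlled: {controlled:.2f}")
# raw ~ 0.57 (style flatters A); controlled ~ 0.45 (A's capability deficit shows)
```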
How often does the leaderboard update?
The leaderboard updates continuously as new votes are cast, but the public display is typically refreshed at intervals rather than in real time. For the purpose of this prediction market, only the scores visible at the specific checkpoint time of 12:00 PM ET on February 28, 2026, will be used for resolution.
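For anyone scripting around the checkpoint, the resolution instant is a fixed wall-clock time in US Eastern Time; a small sketch converting it to UTC using only Python's standard library (zoneinfo ships with Python 3.9+):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# The market checks the leaderboard at 12:00 PM ET on February 28, 2026.
checkpoint_et = datetime(2026, 2, 28, 12, 0, tzinfo=ZoneInfo("America/New_York"))
print(checkpoint_et.astimezone(ZoneInfo("UTC")))  # 2026-02-28 17:00:00+00:00 (EST, pre-DST)
```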
Has the #1 spot changed often?
No, changes at the very top have been relatively infrequent but significant. OpenAI's GPT-4 held the top spot for over a year before being challenged by Anthropic's Claude 3 Opus in 2024. The top rank tends to be stable for months, then shift with a major new model release.
What happens if the leaderboard website is unavailable at the resolution time?
Prediction market contracts based on external data typically include contingency rules. In such cases, resolution would likely rely on the most recent cached or archived data from the leaderboard, or be delayed until the site is accessible, as defined in the market's official specifications.
Educational content is AI-generated and sourced from Wikipedia. It should not be considered financial advice.
10 markets tracked

| Platform | Price |
|---|---|
| Poly | 87% |
| Poly | 8% |
| Poly | 2% |
| Poly | 1% |
| Poly | 1% |
| Poly | 0% |
| Poly | 0% |
| Poly | 0% |
| Poly | 0% |
| Poly | 0% |





Add this market to your website
<iframe src="https://predictpedia.com/embed/HGVXu-" width="400" height="160" frameborder="0" style="border-radius: 8px; max-width: 100%;" title="Which company has the #1 AI model end of February? (Style Control On)"></iframe>