
Trader mode: Actionable analysis for identifying opportunities and edge
This market will resolve according to the company that owns the model with the third-highest Arena Score on the Chatbot Arena LLM Leaderboard (https://lmarena.ai/) when the table under the "Leaderboard" tab is checked on February 28, 2026, 12:00 PM ET. Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/leaderboard/text with style control off will be used to resolve this market. Models will be ranked primarily by their Arena Score at this market's checkpoint.
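To make those mechanics concrete, here is a minimal Python sketch of the resolution check under a literal reading of the rules above: sort the leaderboard by Arena Score and report the owner of the third-ranked model. The model names and scores are invented for illustration, and whether multiple entries from one company collapse into that company's best model is left to the full market rules.

```python
# Minimal sketch of the resolution check, assuming a snapshot of
# (model, owner, arena_score) rows. All data here is hypothetical.
from operator import itemgetter

snapshot = [
    ("model-a", "OpenAI",    1310.2),
    ("model-b", "Anthropic", 1305.7),
    ("model-c", "Google",    1301.4),
    ("model-d", "Google",    1298.9),
    ("model-e", "Meta",      1280.0),
]

# Rank models primarily by Arena Score, highest first.
ranked = sorted(snapshot, key=itemgetter(2), reverse=True)

# The market resolves to the owner of the model in third place.
third_place = ranked[2]
print(third_place[1])  # -> "Google" in this made-up snapshot
```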
Prediction markets currently give Google an 85% chance of having the third-best AI model by the end of February 2026. This means traders see it as very likely, roughly a 5 in 6 chance. The forecast is based on rankings from the Chatbot Arena, a public leaderboard where models are evaluated by thousands of user votes in head-to-head comparisons. Being in the top three is a significant marker of perceived capability and user preference in a crowded field.
The high confidence in Google stems from its consistent performance and recent momentum. Google’s Gemini models have been strong contenders on the Arena leaderboard for over a year, often trading the number two or three spot with Anthropic’s Claude. The company has the resources and incentive to aggressively improve its models to compete with OpenAI, which is widely expected to hold the top position.
Another factor is the structure of the competition. While OpenAI's GPT models and Anthropic's Claude are seen as the most likely candidates for first and second place, the race for third is more open. Other contenders, such as Meta's Llama models or xAI, have not yet posted Arena scores consistent enough to challenge Google's position. Traders are betting that Google can maintain this narrow lead.
The market resolves on February 28, 2026, based on a snapshot of the Arena leaderboard at noon Eastern Time. Any major model releases or updates before that date could shift the rankings. Watch for announcements from Google, Anthropic, or Meta about new model versions in late January or early February 2026. Significant performance jumps from competitors could change the odds, but the short timeline makes a major shakeup less probable.
Prediction markets have a solid track record for forecasting outcomes based on clear, publicly tracked metrics like leaderboard scores. The Chatbot Arena provides continuous, transparent data, which makes these markets less speculative than those based on private decisions or judging panels. However, the reliability has limits. A last-minute model release or an unexpected technical issue could alter the final ranking. The 85% probability is not a guarantee, but it reflects a strong consensus based on observable trends.
Prediction markets assign an 85% probability that Google will own the third-best AI model by the end of February 2026. This price, trading at 85¢ on Polymarket, reflects strong confidence in the outcome. With only six days until resolution, the market treats Google as the heavy favorite, though not a lock. The roughly $176,000 in volume across related contracts confirms serious trader engagement on this specific question.
The current leaderboard snapshot is the primary driver. As of late February 2026, Google's Gemini Ultra model consistently holds third place on the Chatbot Arena leaderboard, behind OpenAI's o1 and Claude 3.5 Sonnet. The arena score gap between Gemini Ultra and the fourth-place model is significant, creating a high barrier for any competitor to close in one week. Historical data shows leaderboard movement at this tier is slow, typically requiring a major new model release to disrupt the order. No such release from companies like Meta or xAI is scheduled before the February 28 checkpoint.
A sudden, unannounced model release from a competitor like Anthropic or Meta before the deadline is the most direct scenario that could shift the rankings. This is considered a low-probability event given standard industry rollout cycles. The more realistic risk is a technical adjustment to the Arena's scoring methodology before resolution, though the market rules are designed to lock in the snapshot from the specified time. Traders betting against the 85% consensus are essentially wagering on an unforeseen and immediate technological leap from another lab, a bet the market currently prices at just a 15% chance.
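For readers newer to these contracts, the arithmetic behind "priced at 85¢" is worth making explicit. A minimal sketch follows: the only number taken from this market is the 85¢ YES price; the alternative probability estimates are illustrative, and real trading would also need to account for fees and slippage.

```python
# Sketch of binary prediction-market payoff math. Assumes a contract
# that pays $1.00 if the outcome occurs and $0 otherwise; the 0.85
# price is the YES price discussed above, everything else is illustrative.

def implied_probability(price: float) -> float:
    """The price of a $1-payout contract reads directly as a probability."""
    return price

def expected_profit(price: float, believed_prob: float) -> float:
    """Expected profit per contract bought at `price`, given your own
    probability estimate `believed_prob` for that contract paying out."""
    return believed_prob * 1.00 - price

yes_price = 0.85
print(implied_probability(yes_price))        # 0.85 -> an 85% implied chance
print(expected_profit(yes_price, 0.85))      # ~0.00: fair if you agree with the market
print(expected_profit(yes_price, 0.95))      # ~+0.10: edge if you believe 95%
print(expected_profit(1 - yes_price, 0.15))  # NO side at ~15 cents: ~0.00 if you believe 15%
```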
AI-generated analysis based on market data. Not financial advice.
This prediction market focuses on identifying which company will possess the third-ranked artificial intelligence model by the end of February 2026, as determined by the Chatbot Arena LLM Leaderboard. The Chatbot Arena, operated by the Large Model Systems Organization (LMSYS Org), is a crowd-sourced platform where users vote on the outputs of anonymous AI models in head-to-head conversations. The resulting 'Arena Score' is an Elo-style rating that reflects real-world user preference, making it a widely cited benchmark for conversational AI capability. The market specifically resolves based on the leaderboard snapshot taken on February 28, 2026, at 12:00 PM Eastern Time.

The question of ranking is significant because the AI industry is intensely competitive, with companies investing billions to develop models that can outperform rivals in tasks like reasoning, coding, and creative writing. Securing a top-three position on a respected, independent leaderboard like the Chatbot Arena confers substantial prestige, influences developer adoption, and can affect market valuations.

Interest in this market stems from the rapid pace of advancement in large language models. The current leaderboard, as of late 2024, is dominated by models from OpenAI, Anthropic, and Google, but the landscape is fluid. New entrants from well-funded startups or other tech giants could disrupt the established order within the 15-month timeframe of this prediction. Traders are effectively betting on which organization will successfully execute its research and development roadmap to climb into an elite tier of AI performance.
The Chatbot Arena leaderboard was launched in May 2023 by LMSYS Org as a response to the limitations of static, automated benchmarks for evaluating LLMs. Traditional benchmarks like MMLU or GSM8K could be gamed through overfitting, and they failed to capture nuanced aspects of conversational ability. The Arena introduced a crowdsourced, blind evaluation system where models are pitted against each other anonymously, and users vote for the better response. This method produced an Elo rating, similar to chess rankings, which reflected real user preference.

In its first year, the leaderboard saw a clear stratification. OpenAI's GPT-4 held the top position for an extended period after its March 2023 release. Anthropic's Claude 3 family, released in March 2024, challenged this dominance, with Claude 3 Opus briefly taking the top spot. This shift demonstrated that the top rank was contestable. Other models, like Google's Gemini Pro and Ultra, Meta's Llama series, and Mistral AI's models, consistently populated the ranks just below the very top, creating a highly competitive second tier. The historical volatility among positions 3 through 10 has been greater than at the very peak, making the 'third best' slot a particularly dynamic and unpredictable battleground for companies.
The ranking of AI models has tangible economic and strategic consequences. For the winning company, a top-three Arena score validates its technology roadmap, attracting enterprise customers, top AI research talent, and further investment. It can strengthen a company's position in negotiations with cloud partners and influence developer mindshare, as builders often choose APIs for new projects based on perceived state-of-the-art performance. For the broader tech industry, the intense competition for leaderboard positions drives rapid innovation and massive capital expenditure on computing infrastructure. However, it also raises concerns about a potential 'benchmark race' where companies might optimize for leaderboard performance at the expense of other critical factors like cost efficiency, energy consumption, or safety alignment. The outcome influences which AI architectures and training methodologies receive the most attention and funding, shaping the foundational technology that will be integrated into countless applications, from search engines and office software to scientific research tools and creative aids.
As of December 2024, the Chatbot Arena leaderboard is led by Anthropic's Claude 3 Opus and OpenAI's GPT-4 Turbo, which are separated by only a few Elo points. The third position is highly contested, with models like Google's Gemini Pro and several variants of GPT-4 occupying the spot. The recent trend shows a clustering of scores among the top five models, suggesting that incremental improvements could lead to frequent rank changes. Several companies, including xAI and Meta, have announced intentions to train significantly larger models for release in 2025, setting the stage for potential leaderboard upheaval before the February 2026 resolution date.
The score uses a modified version of the Elo rating system from chess. Each model starts with a baseline rating. When two anonymous models are compared by a user vote, the winner gains Elo points and the loser loses points, with the amount transferred based on the expected probability of winning derived from their current ratings.
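As a rough illustration of that update rule, here is the textbook chess-style Elo step in Python. The baseline rating and K-factor are assumed values for demonstration only; the Arena uses a modified variant with its own fitting procedure, so treat this as the classic formulation rather than LMSYS's exact implementation.

```python
# Textbook Elo update, as a sketch of the mechanism described above.
# K and BASELINE are assumed illustrative values, not the Arena's.
K = 32.0           # how many points a single vote can move a rating
BASELINE = 1000.0  # assumed starting rating for a newly added model

def expected_score(rating_a: float, rating_b: float) -> float:
    """Win probability for A implied by the current ratings."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(rating_a: float, rating_b: float, a_won: bool) -> tuple[float, float]:
    """Transfer points from loser to winner after one user vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = K * (score_a - exp_a)  # small for expected results, large for upsets
    return rating_a + delta, rating_b - delta

# A new model entering at the baseline upsets a 1300-rated incumbent:
print(update(BASELINE, 1300.0, a_won=True))  # roughly (1027.2, 1272.8)
```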
Gaming is difficult due to the blind, randomized nature of the battles and the scale of human voting. However, companies can optimize their models for the types of conversational tasks presented in the Arena. LMSYS Org monitors for anomalies and may adjust methodologies to preserve integrity.
The Chatbot Arena leaderboard carries ratings to several decimal places, so an exact tie is extremely unlikely. If one did occur, the market resolution would follow the tie-breaking procedure defined in the market's official description, which typically relies on the order listed in the lmarena.ai leaderboard table.
No, inclusion requires submission to and integration by LMSYS Org. Most major proprietary and notable open-source models participate, but some private or specialized models may not be listed. The leaderboard represents a significant, but not exhaustive, sample of the field.
The Arena is valued for its focus on holistic, human-evaluated conversational quality, which correlates well with real-world usability. Automated benchmarks often measure narrow skills and can be overfitted, while the Arena's dynamic, preference-based scoring is harder to manipulate directly.
Educational content is AI-generated and sourced from Wikipedia. It should not be considered financial advice.
11 markets tracked

| Market | Platform | Price |
|---|---|---|
| Google | Poly | 85% |
|  | Poly | 10% |
|  | Poly | 3% |
|  | Poly | 2% |
|  | Poly | 2% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |
|  | Poly | 0% |




