

Trader mode: Actionable analysis for identifying opportunities and edge
This market will resolve according to the company which owns the model which has the highest arena score based off the Chatbot Arena LLM Leaderboard (https://lmarena.ai/) when the table under the "Leaderboard" tab is checked on June 30, 2026, 12:00 PM ET. Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/leaderboard/text with the style control off will be used to resolve this market. If two models are tied for the top arena score at this market's check time, r
Prediction markets currently give Anthropic, the company behind Claude, roughly a 1 in 3 chance of having the top-ranked AI model at the end of June 2026. This means traders collectively see it as possible but not the most likely outcome. The leading position is actually held by the "Other" category, which implies the market sees a strong chance that a different company, not currently listed as a major option, could take the top spot. With over a million dollars wagered, there is significant interest but also clear uncertainty about who will lead in two years.
The current odds reflect the fast-moving and unpredictable nature of AI development. First, the timeline is long. Two years is an eternity in this field, allowing for multiple major research breakthroughs from established players or new entrants. Second, the benchmark being used, the Chatbot Arena, ranks models based on human preferences in blind tests. This is different from technical benchmarks and can be swayed by factors like user interface and perceived "helpfulness," which companies can improve rapidly. Finally, while companies like OpenAI (with GPT models) and Google (with Gemini) are strong contenders, history shows that leadership in tech can shift quickly with a single innovative release.
There are no specific scheduled dates, but the market will react to model releases and leaderboard updates. Major AI conferences such as NeurIPS and CVPR often serve as launchpads for new models. Any announcement from a major company about a "next-generation" model will shift probabilities. The most important signals will be the periodic updates to the Chatbot Arena leaderboard itself throughout 2025 and early 2026, as they provide direct evidence of which models are gaining user preference.
Prediction markets have a mixed record on long-term technology questions. They are often good at aggregating known information about current capabilities and near-term roadmaps. For an event two years away, however, they are essentially forecasting the pace of innovation, which is inherently difficult. The high volume of bets suggests the "wisdom of the crowd" is engaged, but the leading bet being on "Other" is a clear admission that the future is wide open. These markets are better seen as a snapshot of current informed sentiment than a sure prediction.
Prediction markets assign a 35% probability to Anthropic having the top-ranked AI model on the Chatbot Arena leaderboard by June 30, 2026. This price indicates traders view Anthropic as a serious contender, but not the favorite. The market is highly liquid, with over $1.1 million in total volume spread across ten related contracts, suggesting strong institutional and retail interest in this long-term benchmark for AI capability. The current pricing implies the consensus expects a different company, likely OpenAI, to retain its competitive edge.
The 35% probability for Anthropic reflects its status as OpenAI's primary competitor. Claude 3.5 Sonnet's strong performance in mid-2024 briefly challenged GPT-4's lead, demonstrating Anthropic's capacity for rapid iteration. However, historical patterns favor incumbents in AI races due to scale advantages in compute, data, and revenue. OpenAI's consistent delivery of state-of-the-art models, combined with its deep partnership with Microsoft and vast distribution network, creates a high barrier. Traders are pricing in Anthropic's proven ability to compete at the frontier, but also the significant resources and execution required to unseat a leader that has maintained pole position for over three years.
The 122-day timeline to resolution allows for major shifts. A surprise model release from Anthropic, Google, or a dark horse like xAI before the deadline could immediately reset probabilities. The most likely catalyst is OpenAI's expected release of a next-generation model, possibly GPT-5, which would likely cement its lead and crush Anthropic's odds. Conversely, if OpenAI's next release is incremental or delayed, Anthropic's probability would rise sharply. The market will also react to any leaks or official performance benchmarks published in the coming months. The Chatbot Arena leaderboard itself can move prices if weekly updates show a sustained change in the ranking between Claude and GPT models.
This market is trading exclusively on Polymarket, which has established itself as the dominant platform for long-tail, event-driven contracts like technology competitions. The high liquidity here, compared to typical political markets on other platforms, shows specialized interest. The absence of a comparable contract on Kalshi suggests this niche topic falls outside its regulated, U.S.-focused event catalog. Polymarket's global access and crypto-native users are likely driving the deep liquidity, as traders use this contract to hedge technical bets or express long-term views on AI development timelines.
AI-generated analysis based on market data. Not financial advice.
This prediction market focuses on determining which company will possess the artificial intelligence model with the highest performance score by the end of June 2026. The resolution is based on a specific, publicly available benchmark: the Chatbot Arena LLM Leaderboard hosted at lmarena.ai. This leaderboard uses a crowdsourced, competitive evaluation method where users vote on the outputs of anonymous AI models in head-to-head conversations. The model with the highest 'Arena Score' at 12:00 PM Eastern Time on June 30, 2026, will determine the winning company. The AI model landscape is characterized by rapid, public competition among technology giants and well-funded startups. Companies like OpenAI, Anthropic, Google, and Meta regularly release new models, each claiming superior capabilities in reasoning, coding, and general knowledge. The Chatbot Arena has become a widely cited reference point because it aggregates thousands of human preferences, offering a different perspective from traditional academic benchmarks. Interest in this market stems from the high-stakes race for AI supremacy, which has significant implications for product development, investor sentiment, and strategic positioning within the tech industry. Tracking which company leads in public evaluations provides a tangible metric in a field often dominated by technical claims and marketing.
The competitive benchmarking of AI models intensified with the November 2022 release of OpenAI's ChatGPT, which demonstrated the public utility of large language models. Prior to this, model comparisons were largely confined to academic datasets like GLUE or SuperGLUE. In May 2023, LMSYS Org launched the Chatbot Arena, introducing a live, crowdsourced evaluation platform that reflected real-world user preferences. This shifted the competitive dynamic from private research benchmarks to a public, continuously updated leaderboard. The first models to dominate the arena were OpenAI's GPT-4 and Anthropic's Claude. A significant shift occurred in early 2024 with the release of the Claude 3 model family, where Claude 3 Opus briefly surpassed GPT-4 on the Arena Score for the first time, demonstrating that the lead was contestable. Throughout 2024, Google's Gemini Ultra and Meta's Llama 3 also entered the top tiers. This history shows a pattern of leapfrogging, where a new model release temporarily claims the top spot before being overtaken by a competitor's subsequent release, setting the stage for continued volatility through 2026.
The company that leads in AI model performance captures significant strategic advantages. It can attract the best research talent, command premium pricing for API access, and integrate superior AI into its own products, from search engines to office software. This leadership often translates directly into market valuation, as seen in stock price movements following major AI announcements. For developers and businesses building on these platforms, the leading model becomes the default choice for cutting-edge applications, creating network effects and lock-in. Beyond economics, the outcome influences the philosophical direction of AI development. A win for open-source advocates like Meta could accelerate decentralized innovation, while a win for safety-focused firms like Anthropic might prioritize cautious deployment. The result signals which approach—scale, efficiency, safety, or openness—is currently yielding the most capable systems according to human evaluators.
As of mid-2024, the top of the Chatbot Arena leaderboard is highly contested. Claude 3 Opus, GPT-4 Turbo, and Gemini Ultra 1.0 are within a few points of each other, with the rank order fluctuating slightly as more votes are collected. Meta's Llama 3 70B holds a strong position just below these leaders. All major companies have signaled ongoing development, with OpenAI discussing a next-generation model, Google refining the Gemini series, and Anthropic continuing its iterative releases. The competitive field is active, with no single model holding a decisive, long-term lead.
The Arena uses an Elo rating system, similar to chess. Two anonymous models generate responses to the same user prompt. The user votes for which response is better. A win increases a model's Elo score, while a loss decreases it. The 'Arena Score' is this calculated Elo rating.
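To make the mechanic concrete, here is a minimal sketch of an Elo-style update in Python. The K-factor of 32 and the ratings used in the example are assumed values for illustration; the leaderboard's actual Arena Score calculation is more involved than plain per-vote Elo, so this only demonstrates the basic idea described above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one head-to-head vote.

    k is an assumed update step; real leaderboards tune or replace it.
    """
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b - k * (score_a - expected_a)
    return new_a, new_b

# Example: a 1250-rated model beats a 1300-rated model in one blind vote.
print(update_elo(1250, 1300, a_won=True))  # winner gains what the loser drops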
Traditional benchmarks like MMLU or GSM8K test specific knowledge or skills through automated grading. The Arena Score is based on thousands of subjective human preferences across open-ended conversations, measuring which model's outputs users find more helpful, honest, and harmless in practice.
Gaming is difficult by design. Models are anonymized during voting, so users do not know which company's model they are evaluating. LMSYS also employs detection methods for suspicious voting patterns. The scale of voting, over a million votes, further dilutes any targeted manipulation attempts.
This date provides a medium-term horizon that allows for multiple development cycles from competing companies. It is far enough for significant architectural advances but near enough for current trends and roadmaps to remain relevant for prediction.
Prediction market operators typically use archived snapshots or secondary sources in such cases. The market rules specify the exact URL and table to check, and resolvers would look for the most recent reliable data from that source preceding the deadline.
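As a hedged illustration of how an archived snapshot might be located, the sketch below queries the Internet Archive's public snapshot-availability endpoint. This is not Polymarket's documented resolution procedure, it assumes a capture of the leaderboard page exists near the deadline, and the `closest_snapshot` helper is hypothetical.

```python
import json
import urllib.parse
import urllib.request

def closest_snapshot(url: str, timestamp: str) -> str | None:
    """Ask the Wayback Machine for the archived copy nearest to `timestamp`
    (format YYYYMMDDhhmmss). Returns the snapshot URL, or None if none exists."""
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen(f"https://archive.org/wayback/available?{query}") as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest", {})
    return closest.get("url") if closest.get("available") else None

# Hypothetical check near the deadline (June 30, 2026, 12:00 PM ET, roughly 16:00 UTC).
print(closest_snapshot("https://lmarena.ai/leaderboard/text", "20260630160000"))
```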
Educational content is AI-generated and sourced from Wikipedia. It should not be considered financial advice.
10 markets tracked

| Platform | Price |
|---|---|
| Poly | 36% |
| Poly | 34% |
| Poly | 14% |
| Poly | 10% |
| Poly | 6% |
| Poly | 1% |
| Poly | 1% |
| Poly | 1% |
| Poly | 1% |
| Poly | 0% |
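Because the listed prices are rounded (and multi-outcome books typically carry a small overround), they need not sum to exactly 100%. The short sketch below uses the prices from the table above and normalizes them into rough implied probabilities; it is an arithmetic illustration only, not trading guidance.

```python
# Prices (in %) from the table above, top outcome first.
prices = [36, 34, 14, 10, 6, 1, 1, 1, 1, 0]

total = sum(prices)                                # 104 here, due to rounding/overround
implied = [round(100 * p / total, 1) for p in prices]

print(f"sum of listed prices: {total}%")
print("normalized implied probabilities:", implied)
```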





Add this market to your website
```html
<iframe src="https://predictpedia.com/embed/HMqyns" width="400" height="160" frameborder="0" style="border-radius: 8px; max-width: 100%;" title="Which company has best AI model end of June?"></iframe>
```