OpenAI’s latest image model, GPT-Image-2, has secured the top position across all major categories on Image Arena, a widely followed benchmark based on blind human evaluations. The model, which powers ChatGPT Images 2.0, ranked first in Text-to-Image, Single-Image Edit, and Multi-Image Edit tasks, marking one of the most dominant performances recorded on the platform.
Image Arena evaluates models using anonymous user voting, where participants compare outputs without knowing which system generated them. This method is considered one of the more reliable ways to measure real-world performance and user preference in generative AI.
GPT-Image-2 achieved a score of 1,512 in Text-to-Image, outperforming Nano Banana 2 from Google by 242 points. According to Arena, this represents the largest gap ever recorded between first and second place.
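Leaderboards of this kind typically convert blind pairwise votes into Elo-style ratings, which makes a point gap interpretable as an expected head-to-head win rate. The sketch below assumes a standard Elo update with the usual 400-point scale; it is an illustration of that general scheme, not Arena's published methodology:

```python
def expected_score(r_a, r_b):
    """Probability that model A wins a blind pairwise vote, under Elo."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, a_won, k=32):
    """Update both ratings after one vote; points gained by A are lost by B."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return r_a + delta, r_b - delta

# Under these assumptions, a 242-point lead (1512 vs. 1270) corresponds to
# roughly an 80% expected win rate in head-to-head comparisons.
p = expected_score(1512, 1270)
```

This is why a 242-point gap is notable: on the standard Elo scale it implies the leading model wins about four out of five blind comparisons against the runner-up.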
Exciting news – GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards!
A clean sweep with a record-breaking +242 point lead in Text-to-Image – the largest gap we’ve seen to date.
– #1 Text-to-Image (1512), +242 over #2 (Nano-banana-2 with web-search…
— Arena.ai (@arena) April 21, 2026
Dominance Across Categories
The model’s performance extended beyond a single leaderboard. GPT-Image-2 scored 1,513 in Single-Image Edit and 1,464 in Multi-Image Edit, maintaining significant leads over competing models in each category.
It also ranked first across all seven Text-to-Image subcategories, including art, photorealism, product design, and 3D imagery. Improvements over its predecessor were substantial, with gains ranging from nearly 200 to over 300 points depending on the category.
GPT-Image-2 takes #1 in every single Text-to-Image category — all 7 of them. Surpassing the next leading model (Nano-banana-2 with web-search) across the board.
Here’s the drill-down on improvements vs. its predecessor, GPT-Image-1.5-High-Fidelity:
– #1 Product, Branding &…
— Arena.ai (@arena) April 21, 2026
One of the most notable advances is in text rendering within images, an area where AI models have historically struggled. GPT-Image-2 showed strong improvements in accurately generating text, including non-Latin scripts such as Japanese, Korean, and Hindi, suggesting deeper underlying improvements rather than incremental tuning.
From Anonymous Testing to Public Release
Before its official launch, versions of the model appeared anonymously on Arena under codenames such as “maskingtape” and “gaffertape.” These early tests drew attention for their strong performance, mirroring Google’s practice of testing its own models anonymously before release.
The approach allows companies to validate models in real-world conditions before announcing them publicly, using Arena’s ranking system to build credibility.
Intensifying Competition in Image AI
The results highlight intensifying competition in AI image generation. Google’s Nano Banana models previously led the leaderboard and drove significant user growth for Google’s Gemini platform, demonstrating how performance improvements can translate into broader adoption.
GPT-Image-2’s lead suggests OpenAI has regained momentum in this segment, at least in terms of benchmark performance. However, whether these gains will translate into widespread user adoption remains uncertain.
For developers and enterprises choosing AI tools, the results provide a strong signal of capability. A large margin in blind human evaluations indicates that GPT-Image-2 may offer more consistent and preferred outputs in practical use cases, from design and marketing to content creation.