GPT Image 2: What the LMArena Leak and ChatGPT A/B Test Actually Show

Update · April 22, 2026 — OpenAI has officially released GPT Image 2 as ChatGPT Images 2.0. It's live on VidCella now — try it on the GPT Image 2 feature page, with resolution-tiered pricing starting at 5 credits per 1K image and output up to 4K, no ChatGPT subscription required. The rest of this post was written before release, when the model existed only as the LMArena leak and the ChatGPT A/B test described below.
Over the last two weeks, "GPT Image 2" (also written as GPT-Image-2 or, informally, GPT Image 2.0) has become one of the most persistent rumours in the AI space. X threads, Reddit posts, and a growing pile of secondary coverage all point to the same idea: OpenAI has a new image model, it's dramatically better at text rendering and photorealism, and ChatGPT is quietly serving it to a slice of users. Some of that is grounded in real evidence. Some of it is extrapolation from screenshots. The two are worth keeping apart.
This post is an attempt to do that honestly. As of April 18, 2026, OpenAI hasn't officially announced a model called GPT Image 2. The official API model catalog still lists gpt-image-1.5 as the newest named image model, alongside the generic chatgpt-image-latest alias that tracks whatever ChatGPT happens to be using internally. What has happened is something more interesting and more ambiguous: an LMArena leak in early April, and a visible A/B test inside ChatGPT shortly after. The rest of this article walks through what those mean, what people are claiming, and how the whole picture compares to Google's Nano Banana Pro, which is fully public and fully documented today.
What is GPT Image 2?
GPT Image 2 is the community name for an unreleased OpenAI image model that briefly appeared on LMArena in early April 2026 under three codenames — packingtape-alpha, maskingtape-alpha, and gaffertape-alpha — before being pulled. OpenAI hasn't officially announced it. What does exist publicly is a gradual ChatGPT A/B test serving what appears to be the same model.
On roughly April 4, 2026, three unfamiliar image models appeared on LMArena — the public blind-test arena where users compare outputs from anonymized models. Their codenames were tape-themed: packingtape-alpha, maskingtape-alpha, and gaffertape-alpha. They weren't labelled with any provider, and they didn't stay up for long. Within a few days they had been pulled from the rotation, but not before a substantial number of community testers ran prompts through them and posted the results.
The pattern that emerged was consistent. In side-by-side matchups the tape-alpha models handled in-image text with unusual accuracy. Full sentences and paragraphs rendered cleanly, non-English scripts held their shape, and layout work like menus, posters, and fake app screenshots came out visibly crisper than anything else on the board. Photorealism was rated at or above Nano Banana Pro's level. World-knowledge prompts (asking for accurate storefronts, real packaging, correct UI chrome for a well-known website) came back with fewer of the small hallucinations that give generated images away.
None of that officially ties the models to OpenAI. The LMArena convention is that providers submit models anonymously, and the platform has never confirmed which lab was behind the tape-alpha entries. The attribution to OpenAI is community inference, based on the cadence of OpenAI's previous unlabelled appearances, the feel of the outputs relative to gpt-image-1.5, and the fact that the timing lined up with a separate ChatGPT A/B test that started almost immediately afterward. A reasonable reading is "very probably OpenAI," not "confirmed OpenAI." For a deeper breakdown of the codenames and the specific prompts that circulated, Apiyi's grayscale interpretation has the most thorough catalogue.
Is GPT Image 2 already in ChatGPT?
Yes, partially. Since early April 2026, OpenAI has been silently A/B testing what appears to be GPT Image 2 inside ChatGPT — some users see a comparison panel asking them to pick the better of two generated images, others get a quietly upgraded model with no UI change at all.
Around the same window as the LMArena leak, people generating images inside ChatGPT started noticing two different kinds of change. The first is a silent swap: a portion of users report that their image outputs just got visibly better overnight, particularly on text rendering and photorealism, with no announcement and no new UI. The second is more explicit. Some users see a comparison panel inviting them to pick the better of two generated images, which is the classic shape of a human-preference A/B experiment.
Both variants are consistent with OpenAI running a graduated rollout of a new model behind the existing ChatGPT image UI, using the comparison picker on one segment to gather preference data and serving the new model directly to another segment to observe behaviour at scale. This is the same playbook OpenAI has used before major model launches.
There's a popular claim that only Plus subscribers see the A/B test, and the evidence for it is weaker than it looks. Reports have come from users across ChatGPT's paid tiers (the $20/month Plus plan, the older $200/month Pro plan, and the new $100/month Pro tier that OpenAI introduced in early April). The reason paid users dominate the reports is straightforward: paid users generate more images per day than free users and are therefore more likely to notice a silent change. That isn't the same as a policy gate. Whether free users are bucketed in at lower rates or excluded outright isn't something OpenAI has said publicly. If you're on a free plan and you care, the honest answer is that you might be eligible and simply undersampled. This pattern, where a provider silently gates an AI feature by paid tier, is one we've covered before with X's @grok restrictions; the underlying dynamics are similar.
What can GPT Image 2 actually do?
Based on community testing of the leaked model, GPT Image 2's reported strengths are near-perfect in-image text rendering, far better non-English typography, photorealistic UI screenshots, stronger world knowledge, and 4K-class native resolution without the GPT-Image yellow cast. None of these are confirmed by OpenAI.
If you collapse the social media chatter into a rough consensus, five capability claims keep coming up. Each of these is community-reported from the tape-alpha leak and the A/B outputs, not confirmed by OpenAI, and you should read them accordingly.
The first is near-perfect in-image text rendering. The widely repeated number is 99%+ accuracy on full sentences. The number itself isn't from a benchmark anyone can audit; it's a community estimate from comparing outputs. But the qualitative improvement is real enough that a lot of testers put it first in their writeups.
The second is much better non-English typography. Chinese, Korean, Japanese, and Cyrillic scripts in particular have been a weak spot for every major image model, and the tape-alpha outputs reportedly handle them far more reliably, including multi-character labels and mixed-language posters.
The third is photorealistic UI and website screenshots. Asking a current generation model for a realistic screenshot of, say, a news site or a SaaS dashboard typically produces something that reads as "AI-generated" immediately — fonts drift, chrome is off, icons are subtly wrong. The leaked model is said to produce UI mockups that genuinely look like screenshots, which is a meaningful new capability if it holds up.
The fourth is stronger world knowledge. Storefronts of real chains, accurate product packaging, correct architectural details on well-known buildings. The kind of thing where a wrong logo or mislabelled bottle used to give the generation away immediately.
The fifth is higher native resolution, reportedly 4K-class, together with the elimination of the faint yellow-orange colour cast that has been a recognisable GPT-Image fingerprint since the original release. Whether this is a new output pipeline or just a retrained decoder, no one outside OpenAI can say.
GPT Image 2 vs Nano Banana Pro: which is better?
On the leaked comparisons, GPT Image 2 takes a clear lead on photorealism and in-image text rendering, while Nano Banana Pro keeps its advantage in Google Search grounding and structured-output workflows for branded design. Neither is strictly better — they're optimised for different jobs.
The discussion gets muddled online because there are three products in the room, not two, and they're easy to conflate.
The first is OpenAI's currently shipping image model: gpt-image-1.5, available through the API and ChatGPT. OpenAI's own material positions it as strong on prompt adherence and faithful edits, preserving faces, logos, and key visual elements when you ask it to modify an image. On public blind leaderboards, gpt-image-1.5 in its "high" tier has been sitting slightly ahead of Nano Banana Pro on both text-to-image and image-editing tracks. The OpenAI launch post for the new ChatGPT Images covers the 1.5-era capabilities in detail.
The second is gemini-3-pro-image-preview, the developer name for Nano Banana Pro. Google has been unusually specific about what this model is for: complex graphic design, high-fidelity product mockups, factual data visualisation, clean text rendering including multi-language localisation, and (uniquely among current image models) Google Search grounding that lets the generation step pull in real-world references to reduce factual hallucination. Google DeepMind's product page is more or less a manual for where it's supposed to win: posters, charts, marketing assets, branded material, anything where getting a fact wrong is expensive.
The third is the rumoured OpenAI model behind the tape-alpha leak and the A/B test. Its reported strengths are text rendering, UI and website photorealism, world knowledge, and non-English typography, which is almost exactly the ground Nano Banana Pro has been standing on. If the rumours hold up in a shipped product, the comparison gets much closer, and OpenAI probably takes a clear lead on pure photorealism and in-image text. What Nano Banana Pro would keep is the structured-outputs plus Search-grounded workflow, which the rumoured OpenAI model hasn't been claimed to include.
The statement "GPT Image 2 is strictly better than Nano Banana Pro" isn't something the current evidence supports. A safer version is this: the rumoured OpenAI model is pushing harder on the dimensions OpenAI was already strong in, and if it ships in the form the leaks suggest, it narrows or reverses Google's lead on typography and realism while leaving Google's grounding workflow intact.
How to test if you have GPT Image 2 access
Ask ChatGPT for an image with two full sentences of paragraph text, or a mixed-language poster combining Latin and non-Latin scripts. If the text renders cleanly end-to-end with consistent glyphs, you're probably bucketed into the new model.
If you want to check whether your ChatGPT account is in the new bucket, the tell-tale signs are specific. Ask for an image with a block of paragraph text in it: not a single word, but a full couple of sentences. If the text is legible end to end with correct punctuation and consistent glyph shapes, you're probably on the new model. Non-English text is another strong signal: try a mixed-language poster with, for example, both English and Chinese characters, and see whether the non-Latin characters hold together instead of turning into decorative noise. A third probe is a screenshot of a fictional SaaS dashboard: the old model tends to produce something that reads as synthetic almost immediately, while the leaked model produces something closer to a real capture.
Two operational notes. First, refreshing the conversation or starting a new chat sometimes appears to reshuffle the bucket, so if you get an output that feels like the old model and you want to check whether that's consistent, a fresh chat is a cheap thing to try. Second, the A/B doesn't appear to bucket by plan. Upgrading specifically to get access isn't a confirmed path. People on all three paid tiers have reported being both in and out of the new bucket, and free users appear to be undersampled rather than strictly excluded. If you'd rather skip the lottery entirely, no-subscription, pay-as-you-go image generation on a different provider is a cheaper way to experiment than upgrading ChatGPT on the hope of getting bucketed in.
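If you have API access, the same three probes can be scripted instead of run by hand in the chat UI. The sketch below is a minimal, hypothetical harness: it assumes the standard OpenAI Python SDK, the prompt wording is illustrative rather than canonical, and it targets the chatgpt-image-latest alias described earlier, which tracks whatever ChatGPT is serving internally — swap in whatever model name your account actually exposes.

```python
import base64
from pathlib import Path

# Probe prompts mirroring the three signals described above:
# paragraph-length text, mixed-script typography, and a synthetic
# UI screenshot. The exact wording here is illustrative.
PROBE_PROMPTS = [
    "A museum placard whose text reads, in full: 'This wing opened in "
    "1987. It was restored twice, most recently in 2019.'",
    "A bilingual event poster: English headline 'Spring Concert' with "
    "the Chinese line '春季音乐会' directly beneath it, both in clean type.",
    "A realistic screenshot of a fictional SaaS analytics dashboard "
    "with a left sidebar, four metric cards, and a line chart.",
]


def run_probes(model: str = "chatgpt-image-latest",
               out_dir: str = "probe-output") -> list[Path]:
    """Generate one image per probe prompt and save the PNGs locally.

    `chatgpt-image-latest` is the alias this article describes; it is
    not guaranteed to exist on every account.
    """
    from openai import OpenAI  # deferred so the module imports without the SDK
    client = OpenAI()          # reads OPENAI_API_KEY from the environment
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    saved = []
    for i, prompt in enumerate(PROBE_PROMPTS):
        result = client.images.generate(model=model, prompt=prompt,
                                        size="1024x1024")
        # GPT-Image models return base64 image data in the response.
        png = base64.b64decode(result.data[0].b64_json)
        path = out / f"probe-{i}.png"
        path.write_bytes(png)
        saved.append(path)
    return saved
```

Inspect the saved files by eye: legible paragraph text and intact CJK glyphs in the first two outputs are the same tells described for the chat UI.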
When will GPT Image 2 be released?
OpenAI hasn't given a date. Based on its historical 2–4 week cadence from LMArena leak to public release, the most likely launch window is late April through mid-May 2026, before the May 12 DALL·E retirement deadline forces a replacement story.
One hard milestone sits on the near horizon: OpenAI has already announced that DALL·E will be retired on May 12, 2026, which is an internal deadline for having a replacement image story in place one way or another. Either a named gpt-image-2 appears in the public API docs in that window, or OpenAI quietly rolls the new model into chatgpt-image-latest and lets it inherit the capabilities without a new name.
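A cheap way to watch for the first of those outcomes is to poll the API model catalog. A minimal sketch, assuming the standard OpenAI Python SDK; the filter prefixes are guesses based on the model names mentioned in this article, and a gpt-image-2 ID showing up in the list is the signal being watched for:

```python
# Prefixes covering the image-model families this article mentions.
# These are assumptions, not an official taxonomy.
IMAGE_PREFIXES = ("gpt-image", "chatgpt-image", "dall-e")


def image_model_ids(model_ids):
    """Return the image-model IDs from an iterable of model IDs, sorted."""
    return sorted(m for m in model_ids if m.startswith(IMAGE_PREFIXES))


def poll_catalog():
    """Fetch the live model catalog and filter it down to image models."""
    from openai import OpenAI  # deferred so the module imports without the SDK
    client = OpenAI()          # reads OPENAI_API_KEY from the environment
    return image_model_ids(m.id for m in client.models.list())
```

Run poll_catalog() on a schedule and diff the result against the previous run; a new gpt-image-2 entry means the named-release path happened.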
Don't want to wait on the OpenAI lottery? You can already run gpt-image-1.5, Nano Banana Pro, and Seedream side by side on VidCella's image models without a ChatGPT subscription — pay-as-you-go, no waitlist, switch models per generation.
Try GPT Image 2 Now That It's Live
VidCella hosts OpenAI's GPT Image 2 directly — multilingual text rendering, up to 16 reference images for editing, 1K / 2K / 4K output from 5 credits per generation, pay-as-you-go.
From 5 credits · 1K / 2K / 4K output · No ChatGPT Plus required · Pay-as-you-go
