Post: How ChatGPT’s New Image Generator Stacks Up Against Gemini’s Nano Banana Pro


After the major image editing upgrade that Google Gemini received in August, under the whimsical code name Nano Banana, it’s ChatGPT’s turn to supercharge the tools you get for image manipulation inside the chatbot. The new update is called GPT Image 1.5, and it is rolling out now to all users.

A major improvement here, as was the case with Nano Banana, is the way in which ChatGPT can now modify a specific part of an image while keeping everything else constant. You can add or remove an object, or change the color or style of an object, without ending up with a completely different looking image.

Another feature ChatGPT has now borrowed from Gemini: the ability to combine multiple images into one scene. Do you and your best friend want to be pictured in front of the Sydney Harbour Bridge? No problem: just supply the source images and the AI will do the rest. You can also change the visual style while keeping the details constant.

OpenAI says the new image editor and generator is able to follow instructions “more reliably” and render images up to four times faster than before. Text can vary more in style and size, and images should generally be more realistic and error-free, though even OpenAI admits there’s still room for improvement.

It’s the best image generation tool we’ve seen in ChatGPT so far, and it all looks impressive at first glance, but how does it stack up against Gemini and Nano Banana in practice? I put both models to the test on the $20-per-month plans on both platforms (ChatGPT Plus and Google AI Pro, respectively) to see how they compared.

Rendering and editing images

Open ChatGPT on the web or on mobile and you’ll notice a new Images tab in the left-hand navigation pane. This takes you to your existing image library, along with some new prompts for creating images. You get some prompt suggestions, as well as a variety of preset styles for portrait images that you can apply.

Gemini Images

A journalist, a lamp, and a countryside scene courtesy of Gemini.
Credit: Gemini

ChatGPT Images

A journalist, a lamp, and a countryside scene courtesy of ChatGPT.
Credit: ChatGPT

I tested the new GPT Image 1.5 model by getting ChatGPT to produce a cartoon-style image of a busy tech journalist, a lamp in the middle of an empty warehouse, and a rolling landscape of hills in fog. I then had Gemini make the same images with the same prompts. Although the results were very different, they were fairly equal in terms of quality and realism.

Both ChatGPT and Gemini are now quite competent at clean image edits: both AI bots seamlessly changed the journalist’s outfit to a shirt and tie without touching any other part of the picture. This would take a lot of time to do manually, even for a Photoshop expert, and shows how much AI imaging is changing.

Color changes were all handled with aplomb, but the AIs struggled a bit with perspective changes, where I asked to see the same shot from another angle. In these cases, the instructions were followed less closely and the images were less consistent (because new areas needed to be rendered), although ChatGPT did a little better than Gemini at getting good results.

Gemini Images

Clothing can now be changed in seconds (Gemini edition).
Credit: Gemini

ChatGPT Images

Clothing can now be changed in seconds (ChatGPT edition).
Credit: ChatGPT

The classic “remove an object from this picture” challenge was handled with aplomb: both Gemini and ChatGPT managed to remove a cottage from the countryside scene with surgical precision, leaving everything else intact. Again, these are the kind of time-consuming image edits that used to require a lot of careful effort, and can now be done in seconds.

Gemini Images

Gemini’s attempt to remove the cottage.
Credit: Gemini

ChatGPT Images

ChatGPT’s attempt to remove the cottage.
Credit: ChatGPT

Combining and remixing images

Another talent ChatGPT and Gemini now share is being able to combine images. So you can take separate photos of you and your parents, place them in the same shot, and then add a background of wherever you want. You can get the perfect family photos without gathering your relatives or going anywhere.

This was one area where Gemini and ChatGPT struggled a bit more: the editing skills were still impressive, but the results didn’t always look like a single, cohesive scene. The lighting is sometimes off, or different image elements appear at different scales, and you’ll have to do a little more tweaking, editing, and reprompting to get everything right.

ChatGPT fares a little better at blending different images and elements together, and at changing the overall look of the image. When I tried to get the AIs to blend all of my images together into a moody film noir shot, ChatGPT produced something consistent, whereas Gemini’s effort looked too much like a cut-and-paste job.

It can be fun to remix photos over and over again. Remixing photos of family and friends will be popular, but it’s not all that easy: with people you know, anything the generative AI adds will look wrong, because neither ChatGPT nor Gemini knows exactly what these people look like, how they smile, or how they tend to stand or sit.

Gemini Images

Gemini can combine images, but they still look like separate images.
Credit: Gemini

ChatGPT Images

ChatGPT did a better job of creating a new image that looks right.
Credit: ChatGPT

When it comes to ChatGPT vs. Gemini, both are now at a high level, one that puts advanced Photoshop-style capabilities at everyone’s fingertips. If either AI model has an edge right now, it’s ChatGPT, but there’s not much in it. It’s also going to be interesting to see where these photo editing capabilities go next.