Artificial intelligence can produce stunning images, but it’s not uncommon for these images to feature strange issues, like people with too many teeth or cityscapes with Escher-esque street layouts. Google is working on updating Gemini's AI image creation feature to fix those kinds of issues, according to unfinished code first spotted by Android Authority. It looks like an editing capability is on the way that will allow users to make detailed changes to their AI-generated images.
Right now, Google's Gemini text-to-image tools can't edit an image after it has been created. Instead, users have to submit a new prompt and hope that it fixes the issues and produces something that matches what they want to see. That is annoying at the best of times, and especially tedious when there's only a small error to correct. According to the discovered code, Gemini's fine-tuning feature will address the need to make limited changes with two editing methods.
The first option will allow users to submit a request about an AI-generated image and ask for a change to one aspect. For example, if you liked a generated image of a robot and a bird but want to set the scene in a city, you can keep the robot and bird and ask Gemini to change only the background. The second method described in the code is a more interactive approach. Users can circle the part of the image they want to change with a finger or stylus. Once the area is selected, they can describe the desired changes, and Gemini will understand that the instructions refer only to the circled section.
Success in editing with AI
These editing tools could prove especially useful for those working in fields like graphic design, marketing, and social media, where visual accuracy and fast turnaround times are crucial. With them, Google Gemini may better meet the needs of artists, designers, and casual users looking to create polished visual content more efficiently. While the exact release date for these features remains uncertain, their appearance in code suggests they're coming soon. They would also pair well with related features like the upcoming Ask Photos image search feature.
Google won’t be the first to roll out editing tools for AI image creators. These methods are largely the same as those available with OpenAI’s DALL-E portfolio of AI image creation models. In ChatGPT, users can request adjustments to an already produced image, or they can highlight part of it and send a new text prompt to adjust just that part. Similar features exist in many AI image creators, like Ideogram.ai and Adobe Firefly. Still, Google’s plan to incorporate these fine-tuning tools is a technical leap for Gemini. It marks Google’s continued effort to match and surpass its rivals at OpenAI, Meta, and elsewhere when it comes to generative AI tools.