Artificial intelligence can produce impressive images, but it isn’t uncommon for these images to have weird problems, such as people with too many teeth or cityscapes with Escher-style street layouts. Google is working on upgrading Gemini’s AI image creation feature to fix those sorts of problems, according to unfinished code first spotted by Android Authority. The code points to a fine-tuning capability that will allow users to make detailed edits to their AI-generated images.
Right now, Google Gemini’s text-to-image tools can’t edit an image after creating it. Instead, users have to submit a new prompt and hope it fixes any problems and produces something that matches what they want to see. That can be especially tedious when there’s only a small but still distracting error. According to the uncovered code, Gemini’s fine-tuning feature will address the need for limited changes with two editing methods.
The first option will let users submit a prompt about an AI-generated image and ask for a change to one aspect of it. For instance, if you liked an image of a robot and a bird but wanted to set it in a city, you could keep the robot and bird and change only the background by asking Gemini to move them there. The second method described in the code is a more interactive approach. Users could circle the part of the image they want to change using their finger or a stylus. Once the area is selected, they can describe the desired changes, and Gemini will understand that the instructions pertain only to the circled section.
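The uncovered code does not describe how Gemini will confine an edit to the circled section. As a rough illustration only, image editors commonly handle region-limited changes with a mask: pixels inside the selection come from the newly generated image, and everything else is kept from the original. The sketch below assumes a tiny grayscale image stored as nested lists; the function names `circle_mask` and `apply_masked_edit` are hypothetical and are not part of any Google API.

```python
from typing import List

Image = List[List[int]]   # hypothetical grayscale image: rows of pixel values
Mask = List[List[bool]]   # True where the user's circle covers a pixel


def circle_mask(width: int, height: int, cx: float, cy: float, radius: float) -> Mask:
    """Return a mask that is True for pixels inside the circled region."""
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 for x in range(width)]
        for y in range(height)
    ]


def apply_masked_edit(original: Image, edited: Image, mask: Mask) -> Image:
    """Use edited pixels inside the mask and keep the original elsewhere."""
    return [
        [e if m else o for o, e, m in zip(orow, erow, mrow)]
        for orow, erow, mrow in zip(original, edited, mask)
    ]


# Stand-in data: a blank 5x5 original and a fully regenerated version.
original = [[0] * 5 for _ in range(5)]
edited = [[9] * 5 for _ in range(5)]

# Assumed user selection: a circle of radius 1 centered on pixel (2, 2).
mask = circle_mask(5, 5, cx=2, cy=2, radius=1)
result = apply_masked_edit(original, edited, mask)

for row in result:
    print(row)
```

Only the center pixel and its four direct neighbors change here, so a small flaw can be corrected without disturbing the rest of the picture.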
AI Editing Success
These editing tools could particularly benefit those in fields such as graphic design, marketing, and social media, where visual accuracy and quick turnaround times are crucial. With them, Gemini could better serve artists, designers, and casual users who want to create polished visual content more efficiently. While the exact release date of these features remains uncertain, their appearance in the code suggests they won’t be long in coming. They would also pair well with related features like the upcoming Ask Photos image search feature.
Google won’t be the first to bring editing tools to AI image makers. These methods are largely the same as those available with OpenAI’s DALL-E portfolio of AI image-making models. In ChatGPT, users can ask for adjustments to an already produced image, or they can highlight parts of it and submit a new text prompt adjusting that part of the picture. Many other AI image creators, such as Ideogram.ai and Adobe Firefly, offer similar features. Still, Google’s plan to incorporate these fine-tuning tools is a technical jump for Gemini. It marks Google’s ongoing push to match and surpass its rivals at OpenAI, Meta, and elsewhere when it comes to generative AI tools.