Generate, transform and edit images with simple text prompts, or combine multiple images to create something new. All in Gemini.






Capabilities

Multimodal understanding

Upload images and share text instructions with Gemini to create complex and detailed images.

Conversational inputs

Use everyday language while creating images, and keep the conversation going to refine what the model generates.

Real-world knowledge

Generate images that follow real-world logic, thanks to Gemini’s advanced reasoning capabilities.



Limitations

While Gemini can now create a wide range of images, we’re still working on improving key capabilities.

Factual representation

Not every image Gemini generates will be perfect. It can still struggle with small faces, accurate spelling, and fine details.

Character features

The model excels at character consistency, but it may not always get it right. We're working to make this consistency even more reliable.


Safety

We use extensive filtering and data labeling to minimize harmful content in datasets and reduce the likelihood of harmful outputs. We also conduct red teaming and evaluations on content safety (including child safety) and on representation.

Image generation in Gemini has all our latest privacy and safety features. This includes SynthID, our tool that embeds an invisible digital watermark directly into an image, allowing it to be identified as AI-generated.

