Gemini Robotics-ER 1.5

Our state-of-the-art embodied reasoning model – it specializes in understanding physical spaces, planning, and making logical decisions within its surroundings.

Our Gemini-based multimodal model gives advanced world understanding to robots.


Capabilities

Gemini Robotics-ER 1.5 is capable of making detailed plans from simple commands.

For example, suppose a person instructs it to ‘clean the kitchen’. The ER model breaks the task down into smaller, manageable steps: clear the counter, load the dishwasher, wipe the surfaces. The model also supports thinking, so it can reason through a task before it responds.
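The kitchen example can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual output format: the `Plan` class, the `parse_plan` helper, and the numbered-list reply are all hypothetical, and the reply text is written by hand rather than produced by the model.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Plan:
    """A high-level instruction and the ordered sub-steps derived from it."""
    instruction: str
    steps: list[str] = field(default_factory=list)


def parse_plan(instruction: str, model_text: str) -> Plan:
    """Turn a numbered-list reply into a Plan.

    Assumes (hypothetically) that the reply uses lines such as
    "1. Clear the counter"; lines that do not match are ignored.
    """
    steps = []
    for line in model_text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))
    return Plan(instruction=instruction, steps=steps)


# Hand-written stand-in for a model reply, mirroring the example above.
reply = """Here is a plan:
1. Clear the counter
2. Load the dishwasher
3. Wipe the surfaces"""

plan = parse_plan("clean the kitchen", reply)
for number, step in enumerate(plan.steps, start=1):
    print(f"{number}. {step}")
```

Keeping the plan as an ordered list of plain-text steps makes it easy to hand each step to a lower-level controller one at a time.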

Orchestration

Orchestrates a robot's activities, like a high-level brain. Excels at planning and making logical decisions within a physical environment. Interacts in natural language, estimates task progress, and can natively call tools, such as Google Search, to look up information.
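One way to picture this orchestration role is as a loop that dispatches each planned step to a tool and tracks how far along the task is. The sketch below is an assumption-laden illustration: the tool names, the `TOOLS` registry, and `run_plan` are hypothetical, the tools are stubs that always succeed, and progress is reduced to the fraction of steps completed, which is simpler than whatever estimate the model itself produces.

```python
from typing import Callable


# Hypothetical tools: each takes a step description and reports success.
# A real system would call robot skills or a search service here.
def move_objects(step: str) -> bool:
    return True


def wipe(step: str) -> bool:
    return True


TOOLS: dict[str, Callable[[str], bool]] = {
    "move_objects": move_objects,
    "wipe": wipe,
}


def run_plan(plan: list[tuple[str, str]]) -> list[float]:
    """Run (tool_name, step) pairs in order.

    Returns the progress fraction recorded after each completed step and
    stops at the first unknown tool or failed step.
    """
    progress = []
    for done, (tool_name, step) in enumerate(plan, start=1):
        tool = TOOLS.get(tool_name)
        if tool is None or not tool(step):
            break
        progress.append(done / len(plan))
    return progress


kitchen_plan = [
    ("move_objects", "clear the counter"),
    ("move_objects", "load the dishwasher"),
    ("wipe", "wipe the surfaces"),
]
progress = run_plan(kitchen_plan)
print([f"{p:.0%}" for p in progress])
```

Stopping at the first failure keeps the sketch short; a fuller orchestrator would more likely replan from the failed step.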

Advanced spatial understanding

Perceives and understands the surrounding environment to locate and handle objects with greater accuracy.
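Locating an object usually ends with coordinates that downstream code has to map onto the camera image. The sketch below assumes a reply given as JSON with `[y, x]` points normalized to the range 0-1000; this page does not specify the output format, so treat that convention, the `to_pixels` helper, and the sample reply as assumptions for illustration.

```python
import json


def to_pixels(points_json: str, width: int, height: int) -> list[tuple[str, int, int]]:
    """Convert labelled [y, x] points normalized to 0-1000 into pixels.

    Returns (label, x_px, y_px) tuples for an image of the given size.
    The input format is an assumption, not something this page confirms.
    """
    results = []
    for item in json.loads(points_json):
        y_norm, x_norm = item["point"]
        x_px = round(x_norm / 1000 * width)
        y_px = round(y_norm / 1000 * height)
        results.append((item["label"], x_px, y_px))
    return results


# Hand-written stand-in for a model reply.
reply = '[{"point": [500, 250], "label": "mug"}]'
pixels = to_pixels(reply, width=640, height=480)
print(pixels)
```

Normalized coordinates keep the reply independent of image resolution, so the same point can be mapped onto a resized or cropped frame.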

Temporal reasoning

Understands the cause-and-effect relationships between objects and actions as they unfold over time.


Performance

Aggregated performance on 15 embodied reasoning academic benchmarks. The benchmarks include: Point-Bench, RefSpatial, RoboSpatial-Pointing, Where2Place, BLINK, CV-Bench, ERQA, EmbSpatial, MindCube, RoboSpatial-VQA, SAT, Cosmos-Reason1, Min Video Pairs, OpenEQA and VSI-Bench.

Bar chart titled 'Aggregated performance on 15 embodied reasoning academic benchmarks', showing Gemini Robotics ER 1.5 with the highest score of 62.8, outperforming GPT-5 (60.6) and Gemini 2.5 Pro (59.3).

Model information

Name: Gemini Robotics-ER 1.5
Input:
  • Text
  • Image
  • Video
Output:
  • Text
Input tokens: 1M
Knowledge cutoff: January 2025
Availability: Public preview
Model card: View model card
Technical report: View technical report