Google Introduces Gemini Robotics On-Device AI Model, Can Adapt to Different Types of Robots

Published 7 hours ago • 2 minute read

Google DeepMind released a new Gemini Robotics artificial intelligence (AI) model on Tuesday that can run entirely on a local device. Dubbed Gemini Robotics On-Device, it is a vision-language-action (VLA) model that can make robots perform a wide range of tasks in real-world environments. The Mountain View-based tech giant said that since the AI model functions without needing a connection to a data network, it is more useful for latency-sensitive applications. Currently, the model is available to those who have signed up for its trusted tester programme.

In a blog post, Carolina Parada, a Senior Director and Head of Robotics at Google DeepMind, announced the release of Gemini Robotics On-Device. The new VLA model can be accessed via a Gemini Robotics software development kit (SDK) after signing up for its tester programme. The model can also be tested on the company's MuJoCo physics simulator.

Since it is a proprietary model, details about its architecture and training methods are not known. However, Google has highlighted its capabilities. The VLA model is designed for bi-arm robots and has minimal computational requirements. Despite these low requirements, the model allows for experimentation, and the company claims that it can adapt to new tasks with just 50 to 100 demonstrations.

Gemini Robotics On-Device also follows natural language instructions and can perform complex tasks such as unzipping bags or folding clothes. Based on internal testing, the tech giant claims that the AI model “exhibits strong generalisation performance while running entirely locally.” It is also said to outperform other on-device models on “more challenging out-of-distribution tasks and complex multi-step instructions.”

Notably, Google highlighted that while the AI model was trained for ALOHA robots, researchers were also able to adapt it to the Franka FR3 robot and Apptronik's Apollo humanoid robot. All of these are bi-arm robots, which is the only configuration that is compatible with Gemini Robotics On-Device.

The AI model was able to follow instructions and perform general tasks on all the different robots. It could also handle previously unseen objects and scenarios, and even execute industrial belt assembly tasks that require a high level of precision and dexterity, the company claimed.
