Generalist AI Inc., an artificial intelligence startup focused on embodied robotics intelligence, has released GEN-1, a foundation model for robot learning and mastery of physical tasks.
The new model, which debuted Friday, arrives just five months after the company launched GEN-0, a new class of robotics foundation model that allowed the company to train AI models directly on raw movement data.
According to the company, GEN-1 represents a tremendous leap forward in robotic intelligence. It allows machines to master tasks rapidly, learn from interactions, react quickly and overcome challenges at rates never seen before. On multiple tasks, it has a success rate that exceeds 99%. It can also execute tasks almost three times faster than current state-of-the-art models and recover from interruptions faster.
The researchers said they worked on improving three core areas: reliability, speed and improvisation.
Although most models can already reliably repeat tasks in the real world, they are limited to task-specific, repetitive motions and keep complexity down by sticking to simpler actions. GEN-1 is designed to manage longer, multistep tasks, such as assembling items or folding multiple pieces of laundry, that require complex reasoning over time and space, and to do so without becoming confused.
Speed is also often an issue for robots, which tend to slow down when too many objects are in the field of vision or are moving too quickly. Part of the problem for most models is bringing what they see to the reasoning engine quickly enough, translating vision to language to training data. This often slows down motion, which can lead to gaps and stutters. As mentioned above, the team managed an almost three-times speedup, resulting in more fluid motions.
For example, the model can assemble a box in around 12.1 seconds, which the company said is around 2.8 times faster than the closest state-of-the-art model in the industry. GEN-0 and pi-0, another well-known robotics intelligence model from Physical Intelligence, took 34 seconds for an identical box.
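The quoted speedup can be checked against the two timings given above. The short sketch below is only an arithmetic check on the figures in this article; the function and variable names are illustrative and do not come from Generalist.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new time is than the baseline."""
    return baseline_seconds / new_seconds


# Timings as reported in the article (box assembly task).
gen1_seconds = 12.1      # GEN-1
baseline_seconds = 34.0  # GEN-0 and pi-0

ratio = speedup(baseline_seconds, gen1_seconds)
print(f"{ratio:.2f}x")  # prints "2.81x"
```

The result rounds to 2.8, which matches the company's "around 2.8 times faster" claim.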
These two improvements feed into the third result, which is the most important: the ability to recover from interruptions and to learn from mistakes and changes in the environment. In human terms, this is improvisation.
When something doesn’t go quite as expected, such as a part springing out of a hole, a box missing its mark or a door failing to latch, a human normally just goes back and completes the action. An AI could have numerous different reactions, including a forced reassessment, a pattern break or failing to complete the task. It might not even remember how to react to the same event in the future. The human would.
The researchers say that GEN-1 can creatively react to these factors by rapidly adapting to “glitches” in the environment, such as objects slipping, latches failing, items deforming or things not going as planned. It will approach things from different angles, adjust its thinking and try different patterns until something works.
A classic example could be folding a shirt. Getting fabric to go exactly where you want it to is not always an easy feat – it can flop around, curl, warp and wrinkle. Sometimes the shirt will even flip inside-out. When these situations present themselves, the AI will adapt quickly to fix the mess and handle it without creating a worse problem.
The researchers said the model plans and acts in ways that are less rigid than its training data would suggest. In more human words: It thinks outside the box.
Although the researchers at Generalist had glowing things to say about GEN-1, they added that not all tasks hit the 99% success rate. Some complex tasks couldn’t quite hit that ambitious bar, especially at the speed and reliability needed to be useful in everyday settings.