AI models have recently become astonishingly human-like in their ability to produce text, audio, and video on demand. Until now, however, these algorithms have remained largely confined to the digital world rather than the physical, three-dimensional world in which we live. In fact, whenever we try to apply these models to the real world, even the most advanced ones struggle to perform well; just think of how difficult it has been to build good, reliable self-driving cars. However brilliant they may be at generating content, these models lack an understanding of physics and are often detached from physical reality, which leads them to make incomprehensible mistakes.
This is the year, however, when AI will finally step out of the digital world and into the real one we live in. Extending AI beyond its digital boundaries requires reimagining how machines think, fusing AI’s digital intelligence with robotics. This is what I call “physical intelligence”: a new form of intelligent machine that can understand dynamic environments, cope with uncertainty, and make decisions in real time. Unlike conventional AI models, physical intelligence is rooted in physics, in an understanding of fundamental principles of the real world such as cause and effect.
Such capabilities allow physical-intelligence models to adapt to new environments. In my research group at MIT, we are developing models of physical intelligence that we call liquid networks. In one experiment, for example, we trained two drones, one powered by a standard AI model and the other by a liquid network, to find objects in a forest during the summer, using data collected by human pilots. While both drones performed equally well on the task they were trained for, when asked to locate objects in different conditions, in winter or in urban settings, only the liquid-network drone completed its task reliably. The experiment showed us that, unlike traditional AI systems that stop evolving after their initial training phase, liquid networks continue to learn and adapt from experience, much as humans do.
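To make the idea of a continuously adapting network more concrete, here is a minimal sketch of a liquid time-constant style recurrent cell in PyTorch. It is only an illustration of the general technique behind liquid networks; the layer sizes, gating function, and integration step are assumptions made for this example, not the actual models used in the drone experiments.

```python
# Illustrative sketch of a liquid time-constant (LTC) style recurrent cell.
# Sizes, the gate network, and dt are assumptions for this example only.
import torch
import torch.nn as nn

class LTCCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # f(x, h): small network whose output modulates each unit's time constant
        self.gate = nn.Sequential(
            nn.Linear(input_size + hidden_size, hidden_size),
            nn.Sigmoid(),
        )
        self.log_tau = nn.Parameter(torch.zeros(hidden_size))  # per-unit base time constant
        self.A = nn.Parameter(torch.ones(hidden_size))         # per-unit attractor value

    def forward(self, x, h, dt: float = 0.1):
        # One semi-implicit Euler step of
        #   dh/dt = -(1/tau + f(x, h)) * h + f(x, h) * A
        f = self.gate(torch.cat([x, h], dim=-1))
        tau = torch.exp(self.log_tau)
        return (h + dt * f * self.A) / (1.0 + dt * (1.0 / tau + f))

# Usage: unroll the cell over a short input sequence.
cell = LTCCell(input_size=8, hidden_size=16)
h = torch.zeros(1, 16)
for t in range(20):
    obs = torch.randn(1, 8)   # stand-in for camera features or other sensor input
    h = cell(obs, h)
print(h.shape)  # torch.Size([1, 16])
```

Because the effective time constant of each unit depends on the current input, the same trained cell responds differently as conditions change, which is the intuition behind the adaptability described above.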
Physical intelligence can also interpret and execute complex commands derived from text or images, closing the gap between digital instructions and real-world execution. In my lab, for example, we have developed a physically intelligent system that, in under a minute, can iteratively design and then 3D-print small robots from prompts such as “a robot that can move forward” or “a robot that can handle objects”.
Other labs are also making major breakthroughs. For example, the robotics startup Covariant, founded by UC Berkeley researcher Pieter Abbeel, is developing chatbots, similar to ChatGPT, that can control robotic arms when prompted. The company has already raised more than $222 million to build and deploy robots in warehouses around the world. A team at Carnegie Mellon University also recently showed that a robot with just one camera and imprecise actuation can perform extraordinary, dynamic movements, including jumping onto obstacles twice its height and across gaps twice its length, using a single neural network trained through reinforcement learning.
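For readers unfamiliar with that last ingredient, training a single control network with reinforcement learning, the sketch below shows the basic idea on a deliberately tiny toy task: a policy network is rewarded for reaching a target and updated with the REINFORCE rule. The environment, network, and hyperparameters are invented for illustration and are not the CMU system.

```python
# Generic illustration of reinforcement learning for a control policy.
# Toy environment and hyperparameters are invented for this example.
import torch
import torch.nn as nn

class ToyReachEnv:
    """Agent at position p must reach +1.0; actions are small steps left or right."""
    def reset(self):
        self.p = 0.0
        return torch.tensor([self.p])

    def step(self, action: int):
        self.p += 0.1 if action == 1 else -0.1
        done = self.p >= 1.0
        reward = 1.0 if done else -0.01   # small penalty per step encourages speed
        return torch.tensor([self.p]), reward, done

policy = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))
optim = torch.optim.Adam(policy.parameters(), lr=1e-2)
env = ToyReachEnv()

for episode in range(200):
    obs, log_probs, rewards, done = env.reset(), [], [], False
    for _ in range(100):                  # cap episode length
        dist = torch.distributions.Categorical(logits=policy(obs))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, r, done = env.step(action.item())
        rewards.append(r)
        if done:
            break
    # REINFORCE: weight each action's log-probability by the return that follows it
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + 0.99 * g
        returns.insert(0, g)
    loss = -(torch.stack(log_probs) * torch.tensor(returns)).sum()
    optim.zero_grad()
    loss.backward()
    optim.step()
```

The same loop structure, scaled up to camera observations, a richer action space, and a simulated robot body, is the essence of how a single network can learn agile behaviors from trial and error.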
If 2023 was the year of text-to-image and 2024 the year of text-to-video, then 2025 will mark the era of physical intelligence, with a new generation of devices, not only robots but everything from electronic devices to smart homes, that can interpret what we tell them and carry out tasks in the real world.