What are AI ‘world models,’ and why do they matter?

World models, also known as world simulators, are being touted by some as the next big thing in AI.

AI pioneer Fei-Fei Li’s World Labs has raised $230 million to build “large world models,” and DeepMind hired one of the creators of OpenAI’s video generator, Sora, to work on “world simulators.” (Sora was released on Monday.)

But what exactly are these things?

World models take inspiration from the mental models of the world that humans develop naturally. Our brains take the abstract representations from our senses and form them into a more concrete understanding of the world around us, producing what we called “models” long before AI adopted the phrase. The predictions our brains make based on these models influence how we perceive the world.

A paper by AI researchers David Ha and Jürgen Schmidhuber gives the example of a baseball batter. Batters have milliseconds to decide how to swing their bat, which is shorter than the time it takes for visual signals to reach the brain. The reason they can hit a 100-mile-per-hour fastball is that they can instinctively predict where the ball will go, Ha and Schmidhuber say.

“For professional players, this all happens subconsciously,” the research duo writes. “Their muscles reflexively swing the bat at the right time and location in line with their internal models’ predictions. They can quickly act on their predictions of the future without the need to consciously roll out possible future scenarios to form a plan.”

It is these subconscious reasoning aspects of world models that some believe are prerequisites for human-level intelligence.

Modeling the world

Although the concept has been around for decades, world models have gained popularity recently, in part because of their promising applications in the field of generative video.

Most, if not all, AI-generated videos veer into uncanny valley territory. Watch them long enough and something bizarre will happen, like limbs twisting and merging into each other.

A generative model trained on years of video might accurately predict that a basketball bounces, but it has no real idea why, just as language models do not really understand the concepts behind words and phrases. A world model with even a basic grasp of why the basketball bounces the way it does would be better at showing it bounce.

To enable this kind of insight, world models are trained on a range of data, including photos, audio, videos, and text, with the aim of creating internal representations of how the world works and the ability to reason about the consequences of actions.

A sample from AI startup Runway’s Gen-3 video generation model. Image credit: Runway

“A viewer expects that the world they’re watching behaves in a similar way to their reality,” said Alex Mashrabov, Snap’s former head of AI and the CEO of Higgsfield, which is building generative models for video. “If a feather drops with the weight of an anvil or a bowling ball shoots up hundreds of feet into the air, it’s jarring and takes the viewer out of the moment. With a strong world model, instead of a creator defining how each object is expected to move, which is tedious, cumbersome, and a poor use of time, the model will understand this.”

But better video generation is only the tip of the iceberg for world models. Researchers including Meta chief AI scientist Yann LeCun say the models could someday be used for sophisticated forecasting and planning in both the digital and physical realms.

In a talk earlier this year, LeCun described how a world model could help achieve a desired goal through reasoning. A model with a base representation of a “world” (such as a video of a dirty room), given an objective (a clean room), could come up with a sequence of actions to achieve that objective (deploy vacuums to sweep, clean the dishes, empty the trash), not because that is a pattern it has observed but because it knows at a deeper level how to go from dirty to clean.
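The planning idea can be made concrete with a toy sketch. The Python below is purely illustrative and is not LeCun’s method or any real system: the room state, the action names, and the `predict` and `plan` functions are all hypothetical stand-ins. It treats the “world model” as a function that predicts the next state for a given action, then searches over predicted states until it reaches the goal.

```python
from collections import deque

# Hypothetical toy "world": the state is the set of messes still in the room.
# Each action (name assumed for illustration) removes one kind of mess.
ACTIONS = {
    "vacuum_floor": "dusty_floor",
    "wash_dishes": "dirty_dishes",
    "empty_trash": "full_trash",
}

def predict(state, action):
    """Stand-in world model: predict the state that follows an action."""
    return state - {ACTIONS[action]}

def plan(start, goal):
    """Breadth-first search over predicted states.

    Returns a shortest list of actions leading from start to goal,
    or None if no sequence of actions reaches the goal.
    """
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action in ACTIONS:
            nxt = predict(state, action)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

dirty_room = {"dusty_floor", "dirty_dishes", "full_trash"}
steps = plan(dirty_room, set())  # goal: a room with no messes left
print(steps)
```

The planner never sees an example of a finished cleaning routine; it only queries the model for what each action would do. A real world model would have to learn that prediction function from data rather than have it written by hand, which is the hard part.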

“We need machines that understand the world; (machines) that can remember things, that have intuition, have common sense, things that can reason and plan to the same level as humans,” LeCun said. “Despite what you might have heard from some of the most enthusiastic people, current AI systems are not capable of any of this.”

Although LeCun estimates that we are at least a decade away from the world models he envisions, today’s world models are showing promise as elementary physics simulators.

Sora controlling a player in Minecraft and rendering the world. Image credit: OpenAI

OpenAI notes in a blog post that Sora, which it considers to be a world model, can simulate actions like a painter leaving brush strokes on a canvas. Models like Sora, and Sora itself, can also effectively simulate video games. For example, Sora can render a Minecraft-like UI and game world.

Future world models may be able to generate 3D worlds on demand for gaming, virtual photography, and more, World Labs co-founder Justin Johnson said on an episode of the a16z podcast.

“We already have the ability to create virtual, interactive worlds, but it costs hundreds and hundreds of millions of dollars and a ton of development time,” Johnson said. “(World models) will let you not just get an image or a clip out, but a fully simulated, vibrant, and interactive 3D world.”

Big obstacles

While the idea is attractive, many technical challenges stand in the way.

Training and running world models requires massive computing power, even compared to the amount currently used by generative models. While some of the latest language models can run on a modern smartphone, Sora (arguably an early world model) would require many GPUs to train and run, especially if use of such models becomes commonplace.

World models, like all AI models, also hallucinate and internalize biases in their training data. A world model trained largely on videos of sunny weather in European cities might struggle to comprehend or depict Korean cities in snowy conditions, for example, or simply depict them incorrectly.

A general lack of training data threatens to exacerbate these issues, says Mashrabov.

“We have seen models being really limited with generations of people of a certain type or race,” he said. “Training data for a world model must be broad enough to cover a diverse set of scenarios, but also highly specific to where the AI can deeply understand the nuances of those scenarios.”

In a recent post, Cristóbal Valenzuela, CEO of AI startup Runway, says that data and engineering issues prevent today’s models from accurately capturing the behavior of a world’s inhabitants (for example, humans and animals). “Models will need to generate consistent maps of the environment,” he said, “and the ability to navigate and interact in those environments.”

A Sora-generated video. Image credit: OpenAI

If all the major hurdles are overcome, though, Mashrabov believes that world models could “more robustly” bridge AI with the real world, leading to breakthroughs not only in virtual world generation but also in robotics and AI decision-making.

They could also spawn more capable robots.

Robots today are limited in what they can do because they have no awareness of the world around them (or of their own bodies). World models could give them that awareness, Mashrabov said, at least to a point.

“With an advanced world model, an AI could develop a personal understanding of whatever scenario it’s placed in,” he said, “and start to reason out possible solutions.”

