This AI Model Can Intuit How the Physical World Works

The original version of this story appeared in Quanta Magazine.

Here’s a test for toddlers: Show them a glass of water on a desk. Hide it behind a wooden board. Now move the board toward the glass. Are they surprised if the board seems to pass through the glass, as if it weren’t there? Many 6-month-olds are, and by a year, almost all babies have an intuitive concept of an object’s solidity, learned through observation. Now some artificial intelligence models do too.

Researchers have developed an AI system that learns about the world by watching videos, and when it is shown events that violate its expectations, it displays a measure of “surprise.”

This model, developed by Meta and called the Video Joint Embedding Predictive Architecture (V-JEPA), makes no built-in assumptions about the physics of the world shown in the videos. Even so, it can start to make sense of how the world works.

“Their claims are, a priori, very plausible, and the results are extremely interesting,” said Micha Heilbron, a cognitive scientist at the University of Amsterdam who studies how brains and artificial systems perceive the world.

A Higher Level of Abstraction

As engineers who build self-driving cars know, getting an AI system to reliably understand what it sees can be difficult. Most systems are designed to “understand” videos by either classifying their content (“a person playing tennis”, for example) or identifying the shape of an object – say, a car in front – in what is called “pixel space”. The model essentially treats every pixel in the video as equally important.

But these pixel-space models come with limitations. Imagine trying to model a suburban street scene. If it contains cars, traffic lights, and trees, the model may fixate on irrelevant details, such as the movement of leaves, while missing important ones: the color of a traffic light, or the positions of nearby cars. “When you go to photos or video, you don’t want to work in [pixel] space, because there are a lot of details that you don’t want to model,” said Randall Balestriero, a computer scientist at Brown University.
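The difference between comparing scenes pixel by pixel and comparing them in a more abstract representation can be sketched in a few lines. This toy example is illustrative only, not Meta’s actual code: it stands in for a learned encoder with simple patch averaging, and shows how small, irrelevant pixel noise (the “fluttering leaves”) dominates a pixel-space comparison but largely washes out in a coarser representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "video frame": 64x64 grayscale pixel intensities.
frame = rng.random((64, 64))

# The same scene a moment later, with leaves fluttering:
# small random perturbations on every pixel.
frame_with_leaves = frame + 0.05 * rng.standard_normal((64, 64))

# Pixel-space comparison: every pixel counts equally, so the
# irrelevant leaf motion shows up in full.
pixel_error = np.mean((frame - frame_with_leaves) ** 2)

def embed(x):
    # A crude stand-in for a learned encoder: average each 16x16
    # patch, keeping coarse structure and discarding fine detail.
    return x.reshape(4, 16, 4, 16).mean(axis=(1, 3))

# Comparison in the abstract space: the noise averages out, so the
# two frames look nearly identical, as they should.
embedding_error = np.mean((embed(frame) - embed(frame_with_leaves)) ** 2)

print(pixel_error > embedding_error)
```

The point of the sketch is only the contrast: a model judging itself in pixel space is punished for every fluttering leaf, while one judging itself in an abstract space is free to ignore detail that doesn’t matter to the scene.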

Yann LeCun, a New York University computer scientist and director of AI research at Meta, created V-JEPA’s predecessor, JEPA, which works on still images, in 2022.

Photo: École Polytechnique, Université Paris-Saclay