Credit: Apple Depth Pro

Apple unveils Depth Pro, an AI model that can map the depth of a 2D image

10 Oct 2024, 14:04 by Bob Yirka, Tech Xplore

A team of engineers at Apple has developed an AI-based model called Depth Pro that can map the depth of a 2D image. The team has written a paper describing the model and its capabilities and has posted it on the arXiv preprint server. They have also posted an announcement about the model on the company's Machine Learning Research page.

Humans and other animals are able to perceive depth because the brain is able to take two images, one from each eye, and use the differences between them to figure out which parts of the images are closer and which are more distant. Some video cameras have done something similar to create 3D videos.
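The two-view principle described above can be illustrated with the standard stereo relation, in which distance equals focal length times the spacing between the two viewpoints, divided by how far a point shifts between the two images. The sketch below is a textbook idealization for two parallel cameras, not something taken from the Apple paper; the function name and the sample numbers (a 1000-pixel focal length and a 6.5 cm spacing, roughly the distance between human eyes) are assumptions chosen for illustration.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance in metres to a point seen by two parallel, identical cameras.

    focal_length_px: focal length expressed in pixels (assumed value)
    baseline_m:      spacing between the two cameras in metres (assumed value)
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        # Zero shift means the point is effectively infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# A large shift between the two views means the point is close;
# a small shift means it is far away.
near = depth_from_disparity(1000.0, 0.065, 50.0)
far = depth_from_disparity(1000.0, 0.065, 5.0)
print(f"near point: {near:.2f} m, far point: {far:.2f} m")
```

A single-camera system has no second view and therefore no shift to measure, which is why estimating depth from one image requires a learned model.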

Smartphones, because they rely on just one camera for picture taking and video creation, have various hardware and software additions that allow for adding some degree of depth. In this new effort, the engineers at Apple have created a model that generates an entire depth map using only data from the original image, without relying on metadata such as camera intrinsics (the focal length and similar lens parameters).

A depth map is a map that is created using all the pixels in an original image. Each data point on the map represents a single pixel and gives the distance between the camera and the part of the scene that the pixel shows.
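As a rough illustration of that idea, a depth map can be thought of as a grid with the same shape as the image, holding one distance per pixel. The values below are invented for a tiny 3-by-4 example and the helper name is hypothetical; a real depth map would have millions of entries.

```python
# One distance (in metres) per pixel, laid out row by row like the image.
# These numbers are made up: a nearby object in front of a far wall.
depth_map = [
    [5.0, 5.0, 5.1, 5.2],
    [5.0, 1.2, 1.3, 5.2],
    [4.9, 1.2, 1.2, 5.1],
]


def nearest_pixel(depth):
    """Return (row, column, distance) of the pixel closest to the camera."""
    return min(
        ((r, c, d) for r, row in enumerate(depth) for c, d in enumerate(row)),
        key=lambda item: item[2],
    )


row, col, distance = nearest_pixel(depth_map)
print(f"closest pixel is at row {row}, column {col}: {distance} m away")
```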

Such a map allows for the addition of another dimension to a flat picture, giving it 3D effects. Creating a depth map, the team suggests, can generate 3D effects that are sharper than those made using standard smartphone techniques.
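One common way to turn a depth map into that extra dimension is to project each pixel back into 3D space using a simple pinhole camera model. The sketch below is a generic illustration under that assumption and is not the method described in the Apple paper; the function name, the focal length, and the image center values are all hypothetical.

```python
def backproject(depth, focal_px, cx, cy):
    """Convert a depth map into a list of (x, y, z) points in metres.

    depth:    2D list of distances along the camera axis, one per pixel
    focal_px: focal length in pixels (assumed known or estimated)
    cx, cy:   pixel coordinates of the image center (assumed)
    """
    points = []
    for v, row in enumerate(depth):        # v is the pixel row
        for u, z in enumerate(row):        # u is the pixel column
            x = (u - cx) * z / focal_px    # left/right offset from the axis
            y = (v - cy) * z / focal_px    # up/down offset from the axis
            points.append((x, y, z))
    return points


# A flat surface 2 m away, seen as a 2x2 image.
flat_wall = [[2.0, 2.0], [2.0, 2.0]]
cloud = backproject(flat_wall, focal_px=2.0, cx=0.5, cy=0.5)
for point in cloud:
    print(point)
```

The resulting list of points can be rendered from a slightly shifted viewpoint, which is what produces the 3D effect in a flat picture.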

Overview of the network architecture. Credit: arXiv (2024). DOI: 10.48550/arxiv.2410.02073

In their announcement, the team at Apple reports that the model can produce a 2.25-megapixel depth map in just 0.3 seconds when run on a computer with a standard GPU, and that it does so without the types of camera data that are usually needed to generate 3D effects.

By creating a model that operates so speedily, Apple has opened the door to creating 3D imagery from a single-lens camera in real time. And this, the team notes, could have major implications for robots and other real-time mapping applications, such as those used on autonomous vehicles.

More information: Aleksei Bochkovskii et al, Depth Pro: Sharp Monocular Metric Depth in Less Than a Second, arXiv (2024). DOI: 10.48550/arxiv.2410.02073
Depth Pro: github.com/apple/ml-depth-pro
machinelearning.apple.com/research/depth-pro
Journal information: arXiv