Infinite nature: generating 3D flythroughs from still photos

Machine learning has already produced astonishing results from nothing more than still images. Researchers have now extended the technique to create the immersive experience of flying through a landscape, which could be applied to video games or virtual reality.

This is made possible by Perpetual View Generation (PVG), which takes a single still image and generates a long camera path from it. The viewer flies like a bird over detailed, continuously changing landscapes and can steer the camera interactively to choose their own path.

Research problem 

The main limitation of existing techniques is that they cannot produce long sequences of landscapes. “This is a challenging problem that goes far beyond the capabilities of current view synthesis methods, which quickly degenerate when presented with large camera motions,” according to a research team from Google Research, Cornell University, and UC Berkeley.

Method and results

To tackle the problem, the team proposes a hybrid strategy that integrates geometry and image synthesis in an iterative “render, refine and repeat” approach.

The “render-refine-repeat” approach

The system renders the input image (left) to a new desired viewpoint (center), which has missing pixels. A deep network then refines this rendering to produce a new high-quality image (right).

First, the input image is rendered to a new camera view using its disparity map. The rendered image is then refined by synthesizing and super-resolving missing content. Because the output includes both RGB and geometry, the process can be repeated on its own output for perpetual view generation.
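The loop described above can be sketched in a few lines of Python with NumPy. This is a toy illustration, not the researchers' implementation: the camera move is assumed to be a simple sideways shift, and the deep refinement network is replaced by a stand-in that fills holes with the mean of the visible pixels. The function names `render`, `refine`, and `perpetual_view_generation` and the `shift` parameter are hypothetical.

```python
import numpy as np

def render(rgb, disparity, shift):
    """Toy renderer: move each pixel sideways by shift * disparity.

    Returns the warped RGB, the warped disparity, and a mask that is
    True wherever the new view received content.
    """
    h, w, _ = rgb.shape
    new_rgb = np.zeros_like(rgb)
    new_disp = np.zeros_like(disparity)
    mask = np.zeros((h, w), dtype=bool)
    # Visit far pixels first so that nearer pixels (larger disparity)
    # overwrite them when two pixels land on the same target.
    order = np.argsort(disparity, axis=None)
    ys, xs = np.unravel_index(order, disparity.shape)
    for y, x in zip(ys, xs):
        nx = x + int(round(float(shift * disparity[y, x])))
        if 0 <= nx < w:
            new_rgb[y, nx] = rgb[y, x]
            new_disp[y, nx] = disparity[y, x]
            mask[y, nx] = True
    return new_rgb, new_disp, mask

def refine(rgb, disparity, mask):
    """Stand-in for the refinement network (an assumption of this sketch):
    fill missing pixels with the mean of the visible ones."""
    out_rgb, out_disp = rgb.copy(), disparity.copy()
    if mask.any() and not mask.all():
        out_rgb[~mask] = rgb[mask].mean(axis=0)
        out_disp[~mask] = disparity[mask].mean()
    return out_rgb, out_disp

def perpetual_view_generation(rgb, disparity, shifts):
    """Render, refine, repeat: each refined frame seeds the next step."""
    frames = []
    for shift in shifts:
        warped_rgb, warped_disp, mask = render(rgb, disparity, shift)
        rgb, disparity = refine(warped_rgb, warped_disp, mask)
        frames.append(rgb)
    return frames
```

Calling `perpetual_view_generation(image, disparity, [2.0] * 5)` on an `(H, W, 3)` float image and an `(H, W)` disparity array would return five frames. The point of the sketch is the data flow: every step emits both color and geometry, so the output of one iteration is a valid input to the next.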

After being trained on a large dataset of still images, the system was able to produce high-resolution, high-quality flythroughs from a single seed image. The results were published in the paper “Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image” and presented at the European Conference on Computer Vision (ECCV 2022).

Conclusions

This new hybrid strategy is a promising first step towards producing video sequences with hundreds of frames.

The approach opens up many potential directions, including virtual and augmented reality, urban planning, education, and entertainment. 3D landscape flythroughs could be used to create interactive, engaging experiences for virtual tourism, theme park attractions, and video games.
