NVIDIA AI-Powered DLSS 3 pushes the boundaries of graphics and visual computing technologies

In recent years, NVIDIA has made significant advances in AI and machine learning (ML), with its powerful GPUs used in a wide range of applications, from scientific research to self-driving cars.

Towards the end of 2018, NVIDIA launched its new Turing architecture, which featured dedicated Tensor Cores for AI and real-time ray tracing capabilities for advanced graphics rendering.

At the end of 2022, NVIDIA released the GeForce RTX 40 series graphics cards, succeeding the GeForce RTX 30 series. The RTX 40 cards are based on the Ada Lovelace architecture and feature hardware-accelerated ray tracing with NVIDIA’s third-generation RT (ray tracing) cores and fourth-generation Tensor Cores.

DLSS 

DLSS (Deep Learning Super Sampling) is an AI-based technology developed by NVIDIA. DLSS leverages the Tensor Cores present in Turing GPUs to run a neural network (NN) that upscales rendered frames to higher resolutions while preserving image quality close to native rendering.

NVIDIA DLSS architecture

These technologies served as the basis for NVIDIA’s shift from its GTX brand to its RTX brand, with DLSS becoming a key feature of the GeForce RTX 20 series cards.

Real-time ray tracing and NVIDIA DLSS

The basic idea behind DLSS is to use AI to upscale lower-resolution images to higher resolutions (such as upscaling 1080p to 4K) while maintaining high frame rates.

The trained DL model takes the lower-resolution image rendered by the game engine and upscales it to a higher-resolution image during gameplay.

DLSS uses two main types of AI algorithms to generate high-quality images:

  • Neural network: At the heart of DLSS is an NN that has been trained on a large dataset of images, learning to identify patterns and features in them. Once trained, the NN can be used in real time during gameplay to generate higher-quality images.
  • Spatial-temporal filter: In addition to the NN, DLSS also uses a spatial-temporal filter to further improve image quality. This filter helps to smooth out jagged edges and visual artifacts (such as aliasing) that may appear in the rendered image; a minimal sketch of the idea follows this list.
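
To make the spatial-temporal idea concrete, here is a minimal NumPy sketch of the general principle (not NVIDIA’s actual filter, which is proprietary): the current frame is blended with an accumulated history buffer, which suppresses temporal aliasing and flicker at the cost of some ghosting.

```python
import numpy as np

def temporal_accumulate(current: np.ndarray, history: np.ndarray,
                        alpha: float = 0.1) -> np.ndarray:
    """Blend the current frame with the accumulated history.

    A small alpha keeps most of the history, smoothing aliasing across
    frames; real implementations also reproject the history with motion
    vectors and clamp it to avoid ghosting.
    """
    return alpha * current + (1.0 - alpha) * history

# Usage: accumulate a stream of frames into a history buffer.
history = np.zeros((1080, 1920, 3), dtype=np.float32)
for _ in range(10):
    frame = np.random.rand(1080, 1920, 3).astype(np.float32)  # stand-in for a render
    history = temporal_accumulate(frame, history)
```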

DLSS must be integrated into the game engine by the game developer. This involves passing the engine’s lower-resolution rendered output to the DLSS runtime, which then generates the higher-resolution image, as sketched below.
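
The integration point can be pictured as a hook in the render loop. The sketch below is hypothetical: the function names are invented for illustration (the real NVIDIA SDK is a C/C++ API), and the "upscaler" is just a nearest-neighbor resize standing in for the DLSS call.

```python
import numpy as np

# Hypothetical integration sketch; names are invented, not the real NVIDIA SDK.
RENDER_RES = (720, 1280)    # internal render resolution (H, W)
OUTPUT_RES = (1440, 2560)   # display resolution (H, W)

def render_scene(res):
    """Stand-in for the engine's expensive render pass."""
    return np.random.rand(*res, 3).astype(np.float32)

def dlss_upscale(low_res, output_res):
    """Stand-in for the DLSS call; here just a nearest-neighbor resize."""
    sy = output_res[0] // low_res.shape[0]
    sx = output_res[1] // low_res.shape[1]
    return np.repeat(np.repeat(low_res, sy, axis=0), sx, axis=1)

low_res = render_scene(RENDER_RES)            # engine renders at reduced resolution
high_res = dlss_upscale(low_res, OUTPUT_RES)  # upscaler produces the display frame
assert high_res.shape[:2] == OUTPUT_RES
```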

Implementation

DLSS has gone through several iterations since its initial release, with each version introducing improvements and new features. Here are the main versions of DLSS and their software architecture.

DLSS 1.0 

This was the first version of DLSS, released in 2018. It used a single NN to upscale the image from a low resolution to a higher one.

  • Reconstruction Stage: The first stage uses the information from the lower-resolution image to create a high-resolution image. The technique involves reconstructing the missing details and smoothing out the jagged edges that can result from the upscale.
  • Upscaling Stage: In this stage, DLSS 1.0 takes a single raw low-resolution frame and upscales it. As only one frame is used, the NN must generate a lot of new information to produce the desired output.

The training step was performed on NVIDIA’s Saturn V supercomputer using a comprehensive dataset of thousands of aliased input frames, together with common augmentations such as rotations, color changes, and random noise. The model was trained to upscale an input frame from one sample per pixel to a 64-sample-per-pixel supersampled image.
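
As a rough illustration of how such training pairs might be assembled (the actual dataset and pipeline are NVIDIA-internal; this sketch only mirrors the description above), each aliased 1-sample-per-pixel frame is paired with a 64-sample-per-pixel supersampled target, and the augmentations are applied consistently to both:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_pair(inp: np.ndarray, target: np.ndarray):
    """Apply the augmentations named in the text consistently to both images."""
    k = int(rng.integers(0, 4))
    inp, target = np.rot90(inp, k), np.rot90(target, k)   # same random rotation
    gain = rng.uniform(0.8, 1.2)                          # same color/brightness change
    inp = np.clip(inp * gain, 0, 1)
    target = np.clip(target * gain, 0, 1)
    inp = np.clip(inp + rng.normal(0, 0.01, inp.shape), 0, 1)  # noise on the input only
    return inp, target

# One training pair: aliased 1 spp render vs. 64 spp supersampled reference.
aliased = rng.random((256, 256, 3))    # stand-in for a 1 spp frame
reference = rng.random((256, 256, 3))  # stand-in for the 64 spp target
x, y = augment_pair(aliased, reference)
```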

How can DLSS achieve this performance? 

To explain a little further, DLSS uses an auto-encoder based on a CNN (convolutional neural network) previously trained to identify and fix temporal artifacts. An auto-encoder neural network is a type of deep learning model that is commonly used in image and signal processing tasks. The NN consists of two parts: an encoder and a decoder.

Structure of an auto-encoder based CNN

  • The encoder takes an input image and performs a series of convolutional operations that reduce the image to a lower-dimensional representation, known as a bottleneck or latent space.
  • The decoder then takes this lower-dimensional representation and performs a series of transposed convolutional operations to reconstruct the original image, attempting to minimize the difference between the reconstructed image and the original input image.

In short, the encoder learns how to compress the original input into a small encoding, while the decoder learns how to restore the original data from the encoding produced by the encoder.
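
A minimal convolutional auto-encoder in PyTorch looks roughly like this. It is a generic sketch of the architecture class described above, not NVIDIA’s actual network, whose exact layout is unpublished:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: convolutions compress the image into a latent representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),   # H/2 x W/2
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # H/4 x W/4
            nn.ReLU(),
        )
        # Decoder: transposed convolutions reconstruct the image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        latent = self.encoder(x)   # bottleneck / latent space
        return self.decoder(latent)

model = ConvAutoencoder()
x = torch.rand(1, 3, 128, 128)     # dummy RGB frame
out = model(x)
assert out.shape == x.shape
```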

The network is trained with the backpropagation algorithm to minimize the reconstruction error between the input and output images.

During the training phase, the network learns to map the low-resolution input to the corresponding high-resolution output render. The training dataset includes the raw low-resolution inputs, motion vectors, depth buffers, and exposure/brightness information.

During this phase the loss function measures the difference between the predicted high-resolution output and the ground truth.
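
A training step for such a network could be sketched as follows. This is again generic: a standard MSE reconstruction loss and the Adam optimizer stand in for NVIDIA’s actual (unpublished) choices, and it reuses the ConvAutoencoder class defined above:

```python
import torch
import torch.nn.functional as F

model = ConvAutoencoder()  # the class defined in the previous sketch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

inputs = torch.rand(8, 3, 128, 128)        # batch of training inputs
ground_truth = torch.rand(8, 3, 128, 128)  # corresponding high-quality references

for step in range(100):
    prediction = model(inputs)
    loss = F.mse_loss(prediction, ground_truth)  # reconstruction error vs. ground truth
    optimizer.zero_grad()
    loss.backward()   # backpropagation
    optimizer.step()
```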

After the training stage, the encoder can be used as a feature extractor for downstream tasks, such as image classification or object detection.

DLSS 1.0 was integrated into games like Battlefield V and Metro Exodus. However, DLSS 1.0 required its model to be trained for each game individually, which was a time-consuming process. NVIDIA addressed this limitation in later versions of DLSS.

DLSS “1.9” is DLSS 1.0 adapted to run on the CUDA shader cores instead of the Tensor Cores; it was used for the game “Control”.

DLSS 2.0

DLSS 2.0 is a TAAU (temporal anti-aliasing upsampling) implementation that uses information from previous frames, gathered through sub-pixel jittering, to improve the quality of the upscaled image.

This is done by shifting the sampling position by a fraction of a pixel on each frame. The algorithm can thus gather more information about the scene and resolve fine details that may have been missed in a single frame.
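
Sub-pixel jitter offsets are typically drawn from a low-discrepancy sequence so that successive frames sample different positions inside each pixel. The Halton sequence shown below is a common choice in TAAU implementations (an illustrative sketch, not necessarily what DLSS itself uses):

```python
def halton(index: int, base: int) -> float:
    """Low-discrepancy Halton sequence value in [0, 1)."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def jitter_offset(frame_index: int, period: int = 8):
    """Sub-pixel offset in [-0.5, 0.5)^2, cycling every `period` frames."""
    i = (frame_index % period) + 1
    return halton(i, 2) - 0.5, halton(i, 3) - 0.5

# Each frame, the projection is nudged by this offset so the renderer
# samples a slightly different position within every pixel.
for frame in range(8):
    print(frame, jitter_offset(frame))
```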

Note:

TAAU techniques are not upscalers in the sense of DLSS 1.0: TAAU does not create new information from a low-resolution source, but rather recovers data from previous frames. Therefore, it’s recommended that game developers use higher-resolution textures.

The advancements offered by DLSS 2.0 over DLSS 1.0 include:

  • sharper and more detailed images
  • the use of a more generalized NN that doesn’t need re-training for each specific game
  • a reduced overhead of approximately 1-2 ms, as compared to DLSS 1.0’s overhead of approximately 2-4 ms.

DLSS 2.1

Released in 2021, DLSS 2.1 added support for VR (virtual reality) games and improved the visual quality of DLSS in certain scenarios, such as when using ray tracing.

DLSS 2.2

DLSS 2.2 introduced a new ultra-performance mode, which enabled even higher frame rates by using a lower-quality upscaling algorithm.

DLSS 3.0

On September 20, 2022, NVIDIA announced DLSS (Deep Learning Super Sampling) 3, “the next revolution in neural graphics”, which combines neural networks and computer graphics.

NVIDIA DLSS 3.0 architecture

DLSS 3.0 is the next iteration of DLSS. It uses a new-generation OFA (Optical Flow Accelerator) that is included in the latest Ada Lovelace generation of RTX GPUs.

The new OFA is faster and more accurate than the previous OFA available in Turing and Ampere RTX GPUs, which results in better image quality and performance.
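
Conceptually, the OFA computes per-pixel motion (an optical flow field) between rendered frames, which the frame-generation network then uses to synthesize an intermediate frame. Below is a toy illustration of warping a frame halfway along a flow field (not the actual hardware algorithm, which is far more sophisticated):

```python
import numpy as np

def warp(frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Shift each pixel along its flow vector (nearest-neighbor, toy version)."""
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip((ys - flow[..., 1]).round().astype(int), 0, h - 1)
    src_x = np.clip((xs - flow[..., 0]).round().astype(int), 0, w - 1)
    return frame[src_y, src_x]

# Generate a "midpoint" frame by warping halfway along the flow field.
prev_frame = np.random.rand(270, 480, 3).astype(np.float32)
flow = np.zeros((270, 480, 2), dtype=np.float32)
flow[..., 0] = 4.0                        # toy flow: everything moves 4 px right
mid_frame = warp(prev_frame, 0.5 * flow)  # intermediate frame at t + 0.5
```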

NVIDIA DLSS 3 Multiplies Performance By Up To 4x

According to NVIDIA benchmarks, DLSS 3 multiplies performance by up to 4x over brute-force rendering, enabling higher resolutions and graphics settings while maintaining smooth and responsive gameplay.

The technology has been embraced by the gaming industry, with more than 35 games and applications adopting it.

Over 35 games and applications have already started integrating NVIDIA DLSS 3

DLSS 3 operates on the GPU, bypassing CPU bottlenecks and boosting frame rates. However, in order to generate high-quality upscaled images in real time, DLSS 3.0 still requires a significant amount of processing power, both from the graphics card and from the CPU.

While the frame generation process itself may be independent of the CPU, the overall performance of DLSS 3.0 is still affected by the performance of the CPU and other system components.

At the time of release, DLSS 3.0 was not compatible with VR displays. It is worth noting that DLSS technology is continuously evolving and improving, so future updates may address this limitation.

Conclusion

DLSS is a powerful technology that can significantly improve the performance of games while maintaining high image quality. Its software architecture has evolved over time to become more efficient and effective, and we can expect further improvements in future versions.

DLSS has several benefits over traditional rendering methods:

  • higher frame rates, which can improve the overall gaming experience by making the game feel more responsive and fluid
  • a reduced workload on the GPU, which can lead to lower power consumption and less heat generation
  • improved visual quality, with fewer visual artifacts and a sharper, clearer image

Artificial Intelligence and Deep Learning already play a significant role in graphics and visual computing technologies, and future developments may include speeding up the rendering process and enhancing images and videos in various ways, such as removing noise, improving color and contrast, and increasing resolution.

According to Andrew Edelsten, Director of Developer Technologies at NVIDIA, games will increasingly be powered by AI, including text-to-speech systems, chatbots, and cheat-detection systems that detect game hacks.
